Gene Expression Profiling of Carcinoma Breast and its Prognostic Signature: A Review

Breast cancer research and clinical care have entered a new era with the development of modern genetic technologies. In the past, breast cancer diagnosis, prognosis and treatment decisions were based on clinical and pathological analysis of the breast cancer tissue and axillary lymph nodes. Prognosis and treatment recommendations based on these features predict the outcome imperfectly and result in excessive treatment and chemotherapy with marginal benefit. Hence new methods are needed to understand breast cancer properly and to optimize and individualize its treatment and prognosis. The development of gene microarray techniques has enabled scientists to measure the expression of thousands of genes simultaneously and thus create gene expression profiles for different types of breast cancer. Gene expression profiling has improved our understanding of the heterogeneity of breast cancer at the genomic level, challenged the clinical classification of breast cancer, served as an important prognostic indicator and, most importantly, begun to guide treatment in women with early breast cancer. Incorporation of molecular assays into the treatment and planning strategy of breast cancer continues to be a work in progress. Supported by strong scientific evidence, this approach is evolving quickly and is likely to become a standard of practice in the near future. This article provides an overview of the development and application of molecular assays in breast cancer.


Introduction
Breast cancer is an umbrella designation which includes different tumor subtypes. These subtypes differ in their prognosis and their response to treatment. [1] The traditionally used "one size fits all" approach has important limitations. These include: [2]
1. Drugs may be administered at suboptimal doses to drug-resistant patients.
2. Tumors highly responsive to medication may receive additional unnecessary treatment.
3. If we could identify the most responsive subset of patients, appropriate and cost-effective treatment could be instituted.
The heterogeneity of outcome and drug sensitivity has continued to drive the discovery of a second generation of molecular predictors. These have been developed using DNA-based arrays or reverse transcriptase polymerase chain reaction (RT-PCR) and are expected to improve the quality of care for patients with breast cancer. Gene expression is a measure of gene activity, which is determined by the number of times a gene is transcribed into mRNA and, finally, by the protein it encodes. The set of transcripts captured by DNA microarray technology or RT-PCR is called the transcriptome. A list of genes associated with prognosis or with response to various treatments at the phenotype level is called a "gene profile" or "gene signature".

Gene Profiling platforms
The four major platforms used for gene profiling are:
1. Immunohistochemistry (IHC).
2. Fluorescent in situ hybridization (FISH).
3. Quantitative reverse transcriptase polymerase chain reaction (qRT-PCR).
4. Quantitative cDNA microarray.
All platforms use formalin-fixed, paraffin-embedded tissues except cDNA microarray, which preferentially uses fresh frozen samples, thus limiting its widespread application to the population at large. [3][4][5][6][7][8][9][10][11][12][13][14][15][16] Our understanding of the biology underlying breast cancer has improved considerably owing to the ability to measure concurrently the expression levels of tens of thousands of genes in cell lines and tumor specimens. The Brown and Botstein laboratory at Stanford was the first to publish studies showing that cDNA (complementary DNA) microarrays could identify expression signatures specific to breast cancer cells. [17][18] A similar study showed that gene expression patterns can predict the invasive capacity of breast cancer cell lines. [19] It also gave a biological classification of breast cancer on the basis of distinctive patterns of gene expression. This molecular interpretation was subsequently validated by studies from Sorlie et al. [20][21][22] Sorlie et al correlated it with patient outcome, recognized the known clinically important subgroups and also indicated the existence of novel subtypes of breast carcinoma. Although the inherent variability of breast cancer has been recognized for decades, only recently has gene expression profiling demonstrated the heterogeneity of breast cancer at the genomic level. The study of Perou et al (2000) described six intrinsic subtypes of breast cancer using unsupervised analysis of gene expression profiles of 65 breast cancer tissues. [23] A set of 456 genes was identified as being associated with the differences among these breast cancer tissues.
Initially, six molecular subgroups were identified: luminal A, B, and C, HER2+/ER-, basal-like, and a normal breast-like group. [24] With time it became apparent that the luminal C subtype is seldom seen, and it is not certain whether the normal breast-like group truly represents cancer tissue or was merely a sampling error caused by benign breast tissue embedded in the cancer tissue. The remaining four intrinsic subtypes of breast cancer, luminal A and B, HER2+/ER- and basal-like, are reasonably widely accepted at this time. It is reassuring that the intrinsic subtypes correlate with the known pathological characteristics of breast cancers, especially the estrogen receptor alpha (ER) status and immunological features. Luminal A and luminal B subtypes express ER protein, whereas HER2+/ER- and basal-like subtypes are ER-negative. Subtyping breast cancer into luminal or basal-like groups also correlates with pathological staining characteristics. Luminal tissues stain positive for keratins 8 and 18, whereas basal-like tissues stain positive for keratins 5 and 17. The importance of the intrinsic gene subtyping set has been validated through its application to available data sets. [25][26][27] It has been demonstrated that each subgroup has a different clinical outcome. Luminal A patients have the best overall survival and disease-free survival, luminal B and HER2+/ER- patients have an intermediate outcome, and basal-like patients do the worst. [28,29]
Table 1: Classification of Molecular Subtypes.
Luminal tumors respond well to hormone therapy but poorly to conventional chemotherapy. [30] Luminal A tumors can be adequately treated with endocrine therapy, while luminal B tumors, which are more proliferative, may benefit more from a combined strategy of chemotherapy and hormonal treatment.
Other targeted approaches, such as anti-angiogenic therapy with anti-VEGF antibodies (bevacizumab), can also improve progression-free survival in metastatic breast cancer.

The HER2 over-expression tumors:
The intrinsic HER2 over-expression tumors are those which are ER negative, PR negative and HER2 positive. Their HER2 over-expression is accompanied by over-expression of other genes in the HER2 amplicon, such as GRB7 and PGAP3. [31,32] These tumors harbour TP53 mutations in 40% to 80% of cases. HER2 over-expression tumors are more likely to be grade 3. No association has been found between HER2 over-expression tumors and age, race or any known risk factor. [33] Though HER2 over-expression breast tumors carry a poor prognosis, they are sensitive to anthracycline and taxane-based neoadjuvant chemotherapy. They also show a significantly higher rate of pathological complete response than luminal breast cancers. [34]

The basal subtype:
The basal subtype is composed of triple-negative (ER-, PR-, HER2-) tumors with expression profiles mimicking those of the basal epithelial cells of other parts of the body and of normal breast myoepithelial cells. Such expression patterns include absent or low expression of hormone receptors and HER2, and high expression of basal markers (such as keratins 5, 6, 14 and 17, and EGFR) and proliferation-related genes. Tumors characterized by basal cytokeratin expression are more likely to have low BRCA1 expression and to harbour TP53 mutations. Like HER2 over-expression tumors, basal cancers are likely to be grade 3. Basal tumors account for 60% to 90% of triple-negative breast cancer cases. These tumors are of particular interest because they follow an aggressive clinical course and currently lack any form of standard targeted systemic therapy. Compared with other subtypes, these tumors are associated with younger age and are more common in African-American women, especially among pre-menopausal individuals. Risk factors for this subtype include earlier menarche, high waist-to-hip ratio, and a lack of breastfeeding together with high parity. [35] These tumors are associated with lower disease-specific survival and a higher risk of local and regional relapse. The metastatic pattern shows a tendency towards visceral organs (excluding bone) and is less likely to involve lymph nodes. [36] Given the triple-negative receptor status, basal tumors are not amenable to conventional targeted breast cancer therapies, leaving chemotherapy as the only option in the therapeutic armamentarium.

The prognostic signature:
A prognostic factor is any factor present at the time of initial diagnosis (in the absence of systemic adjuvant treatment) that correlates with the natural history of the disease. Prognostic factors may be correlated with the disease-free interval or with overall survival. For decades, oncologists have evaluated the prognosis of breast cancer patients on the basis of clinical data such as tumor size, axillary node status, and nuclear grade. Subsequently, ER-positive and ER-negative tumors, which have different outcomes, were recognized. Histological grade, based on the mitotic index, nuclear pleomorphism, and architectural differentiation, is one of the most important prognostic factors for breast cancer, yet has only a moderate level of inter-observer agreement among pathologists. [37] Treatment options for early-stage breast cancer include chemotherapy, endocrine therapy and trastuzumab plus chemotherapy (in patients with HER2 over-expression). Overtreatment, which is associated with adverse effects and cost, is common in the adjuvant setting. Traditionally, oncologists choose adjuvant therapy based on pathological factors such as tumor size, tumor grade, and nodal status, as well as patient-related factors such as age, menopausal status, and medical co-morbidities. However, patients with the same clinicopathological parameters and biomarkers can have different outcomes.
It is important to investigate whether new genomic advances may help to predict the natural history of breast tumors and improve our ability to predict clinical outcome in patients. Currently, the use of genomics in clinical practice can provide valuable information about the potential benefit of chemotherapy versus endocrine therapy in ER-positive, node-negative, HER2-negative patients. In this group of patients, genomic testing can be used with the objective of minimizing overtreatment while improving disease-free survival (DFS) and overall survival (OS). [38] With the help of gene expression profiling, at least six well-characterized prognostic models have been developed, and some of them are moving into clinical settings. These six profiles are the intrinsic subtype, the 70 gene profile, the 76 gene prognostic classifier, the wound response signature, the 21 gene recurrence score, and the two gene ratio prognostic model. Of these, the first four can only be derived using RNA from fresh frozen tissues, whereas the last two can be performed on fixed archived tissues.
The gene expression profile prognostic models:
1. Intrinsic subtype:
Parker et al [39] proposed a risk model incorporating the gene expression-based intrinsic subtypes, i.e. luminal A, luminal B, HER2-enriched and basal-like. This model was based on microarray technology and RT-PCR. The investigators described a 50 gene subtype predictor (PAM-50) of prognosis in untreated patients and of pathological complete response in patients receiving neoadjuvant treatment. The study population included 761 patients who did not receive chemotherapy and 133 patients who received anthracycline or taxane-based treatment. A risk of relapse (ROR) score was proposed for each patient. ROR was based on tumor subtype alone (ROR-S) or on subtype together with tumor size (ROR-C), and served as a predictor of pathological response to neoadjuvant treatment. This method allows evaluation of the relationship between the genomic or intrinsic classification and clinical response. For this reason, the PAM-50 platform is currently being used in clinical trials. [40][41][42][43] The model predicted that luminal A patients did the best, followed by luminal B and HER2+/ER- patients, and that basal-like patients did the worst.

2. 70 gene profile:
The 70 gene prognostic signature was developed by the Netherlands Cancer Institute from primary breast cancer cases. [44,45] The signature, which has prognostic value for distant metastasis within 5 years, was first identified with oligonucleotide microarrays in a cohort of 76 node-negative breast cancers occurring in women below age 55 who had not received systemic adjuvant therapy. The signature mainly included genes involved in the cell cycle, invasion, metastasis, angiogenesis, and signal transduction. It was then validated on a larger set of 295 young patients from the same institution, including both node-negative and node-positive tumours in treated and untreated patients, and proved to be the strongest predictor of distant metastasis-free survival, independent of adjuvant treatment, tumour size, histological grade and age, in both node-negative and node-positive patients. The 70 gene signature was significantly better at predicting distant relapse-free survival than the standard St Gallen or National Institutes of Health clinical criteria.

3. 76 gene prognostic classifier:
Wang et al [46] used the Affymetrix U133A array platform to develop the 76 gene Rotterdam signature. It was developed to predict the distant relapse rate in 115 node-negative breast cancer patients of all age groups. The study built a classification algorithm that considered ER-positive patients separately from ER-negative patients, on the grounds that the mechanisms of disease progression could differ between these two ER-based subgroups of breast cancer patients.

Agrawal: Gene Expression Profiling of Carcinoma Breast
The 76 gene signature was mainly associated with cell cycle and cell death, DNA replication and repair, and immune response. In the same study, the authors validated the prognostic ability of this signature in an additional set of 171 node-negative untreated breast cancer patients. More recently, the same group provided additional evidence for the prognostic performance of their predictor in a multicentric cohort of 180 node-negative untreated breast cancer patients obtained from different institutions. [47]

4. Wound response:
Chang et al [48] demonstrated that wounds share many features with tumors; they identified a wound response gene signature whose genes appeared to be coordinately regulated in many human tumors, including breast cancer. They also found that breast cancer patients whose tumors expressed the wound response signature had a markedly worse clinical outcome. [49] They demonstrated that their signature improved on current risk stratification based on the NIH and St. Gallen guidelines and that it was able to identify a subset of low-risk patients within the clinical high-risk group. Altogether, their results pointed to a strong link between wound response and cancer behaviour on the genomic scale and suggested that this signature would be a clinically useful tool for recognizing, at an early stage, the cancers at high risk of progression. These studies provided an experimental model of wound healing that could be used to study the underlying mechanisms and as a basis for developing inhibitors of the response. It is believed that an active wound healing genetic profile predicts increased risk of metastasis and death in patients with breast, lung and gastric cancers. [50]

5. 21 gene score:
It is the most widely used gene expression profile prognostic model in the U.S. It was developed by evaluating 250 genes that could putatively correlate with breast cancer recurrence, selected on the basis of existing literature, databases, and experimental evidence, to identify a 21 gene panel comprising 16 cancer-related genes (including HER2, ER-related and proliferation genes) and five reference genes. Their combined expression was then assessed in tumor tissue from women who had previously participated in adjuvant therapy trials with known outcomes. From this it was possible to derive a recurrence score reflecting prognosis for women with node-negative, ER-positive breast cancer receiving tamoxifen.
The recurrence score is reported on a scale from 0 to 100, with low risk defined as a score less than 18, intermediate risk as a score of 18-30, and high risk as a score greater than 30. A useful attribute of this assay is that it does not require frozen tissue and can be performed on fixed tissues. It was validated through its application to samples from 668 of the 2,617 tamoxifen-treated patients in the NSABP B-14 trial, which examined the benefit of tamoxifen in hormone receptor-positive, node-negative breast cancer. Among these women, who received only tamoxifen, 6.8% of patients in the low recurrence score group had distant recurrence at 10 years, compared with 14.3% in the intermediate-risk group and 30.5% in the high-risk group. The difference between the low-risk and the high-risk groups was statistically significant (p < 0.001). Multivariate analysis revealed that the power of the 21 gene recurrence score was independent of age and tumor size. [56] Another study demonstrated that the 21 gene set is more accurate in predicting outcomes than Adjuvant! Online. [51] The predictive power of the 21 gene recurrence score was further validated in predicting breast cancer-related mortality [52] and responsiveness to chemotherapy and hormonal therapy. [53,54] Based on the strong evidence of its prognostic and predictive power, the 21 gene recurrence score has been cleared for commercial use in an assay known as OncotypeDx®.
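The risk thresholds quoted above can be written as a short rule. The following Python sketch is illustrative only and rests on stated assumptions: the function name `recurrence_risk_group` and the dictionary `DISTANT_RECURRENCE_AT_10_YEARS` are hypothetical names introduced here, and the sketch does not compute the score from gene expression; it only maps an already computed score to the risk groups and to the 10-year distant recurrence rates reported in the text.

```python
def recurrence_risk_group(score: float) -> str:
    """Map a 21 gene recurrence score (0 to 100) to a risk group.

    Cut-offs follow the text: low < 18, intermediate 18-30, high > 30.
    """
    if not 0 <= score <= 100:
        raise ValueError("recurrence score must lie between 0 and 100")
    if score < 18:
        return "low"
    if score <= 30:
        return "intermediate"
    return "high"


# Percentage of tamoxifen-only patients with distant recurrence at 10 years,
# as quoted in the text for the NSABP B-14 validation set.
DISTANT_RECURRENCE_AT_10_YEARS = {"low": 6.8, "intermediate": 14.3, "high": 30.5}

if __name__ == "__main__":
    for example_score in (12, 18, 30, 45):  # hypothetical example scores
        group = recurrence_risk_group(example_score)
        print(example_score, group, DISTANT_RECURRENCE_AT_10_YEARS[group])
```

The boundary values 18 and 30 fall in the intermediate group, matching the stated range of 18-30.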

6. Two gene ratio model:
It was developed in 60 ER-positive, early-stage breast cancer patients treated with tamoxifen. The ratio of the expression of homeobox B13 (HOXB13) to that of interleukin 17B receptor (IL17BR) was used to predict disease-free survival in those patients. A higher ratio predicts a worse clinical outcome. The model may be useful for identifying ER-positive early breast cancer patients who have a poor outcome with tamoxifen and could possibly benefit from additional therapy rather than tamoxifen alone. [55]
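The arithmetic behind such a two gene ratio can be sketched as follows. This is a minimal illustration under stated assumptions, not the published assay: it assumes positive, linear-scale normalized expression values for the two genes (homeobox B13 and interleukin 17B receptor, abbreviated HOXB13 and IL17BR), expresses the ratio on a log2 scale, and uses no cut-off because none is given in the text. The function names and the sample values are hypothetical.

```python
import math


def two_gene_log_ratio(hoxb13: float, il17br: float) -> float:
    """Return log2(HOXB13 / IL17BR) for positive, linear-scale expression values."""
    if hoxb13 <= 0 or il17br <= 0:
        raise ValueError("expression values must be positive")
    return math.log2(hoxb13 / il17br)


def rank_by_ratio(samples: dict) -> list:
    """Order sample identifiers from highest to lowest ratio.

    `samples` maps an identifier to a (HOXB13, IL17BR) pair. According to the
    text, a higher ratio predicts a worse outcome on tamoxifen.
    """
    return sorted(
        samples,
        key=lambda sample_id: two_gene_log_ratio(*samples[sample_id]),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical expression values, for illustration only.
    cohort = {"patient_a": (1.0, 4.0), "patient_b": (8.0, 2.0), "patient_c": (2.0, 2.0)}
    for sample_id in rank_by_ratio(cohort):
        print(sample_id, two_gene_log_ratio(*cohort[sample_id]))
```

Working on the log2 scale makes the ratio symmetric around zero, so equal expression of the two genes gives 0 and a fourfold excess of either gene gives +2 or -2.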

Prognostic model | Tissue | Purpose
21 gene recurrence score | Fixed | To identify women with node-negative, hormone-receptor-positive breast cancer who would benefit from addition of chemotherapy to tamoxifen
76 gene prognostic signature | Fresh | To predict disease-free and overall survival in patients with node-negative early-stage breast cancer who have not received systemic therapy
Wound response | Fresh | To predict increased risk of metastasis and death in patients with breast cancer
Two gene ratio | Fixed | To identify early-stage steroid receptor-positive breast cancer patients who would benefit from addition of chemotherapy to tamoxifen
Intrinsic subtype | Fresh | To predict clinical outcomes of breast cancer patients

Discussion
Gene expression profiling is enabling scientists to understand the heterogeneous nature of breast cancer at the genomic level. Several gene expression profiles for breast cancer have emerged from the initial studies and appear to be generally concordant in their ability to predict poor outcome. Of these profiles, the OncotypeDx® and MammaPrint® assays are the best validated and are commercially available. Their role in clinical practice is being refined through ongoing clinical trials. Other efforts are directed at determining host factors that might help to identify prognosis as well as response and toxicity to therapy. The emergence of prognostic (associated with clinical outcome) and predictive (associated with response to therapy) gene expression signatures holds promise for attempts to individualize breast cancer treatment.

Although most gene signatures provide valuable information for classifying breast tumors and have consistently predicted clinical outcome, the challenge is how to integrate the genetic information into a prognostic model that can easily be applied in a clinical setting. Breast cancer diagnosis and treatment decisions will continue to rely largely on classical histopathological and clinical parameters until some crucial issues are resolved.
The crucial issues:
1. How can we compare and eventually integrate the information from the different signatures that have been identified, in order to optimize risk stratification for breast cancer patients?
2. Would a combined approach of clinical and genetic data improve clinical outcome prediction?
3. Are the technologies routinely applicable and reproducible?
4. Finally, even after identifying the high-risk patients who need systemic adjuvant therapy, we still do not know which therapy will be most effective for the individual patient. Indeed, identifying markers that could predict response to a particular drug remains a great challenge for the medical community, as commonly used therapeutic agents are ineffective in many patients and side effects are common.
The pitfalls of genomic studies:
1. Several studies have already used a genome-wide approach to identify gene expression profiles that correlate with chemo- or hormone sensitivity. [56][57][58] Although the results support the concept that predictors of response to anti-cancer drugs can be developed, the predictors remain sub-optimal. This is due to the small sample sizes used to build and validate them, which puts their robustness in question.
2. Many studies suffer from methodological limitations such as:
2.1. Choice of end-point (clinical versus pathological response).
2.2. Choice of regimen to be studied (combination chemotherapy as compared to a single agent).
2.3. The type of population to be evaluated (e.g. the whole breast cancer population as opposed to a relevant molecular subgroup). Thus, evaluating a predictor in an inappropriate cohort might lead to underestimation of its performance. [59,60]
Therefore, once predictors are identified, it is always appropriate to investigate whether they merely correlate with the natural history of the disease, predict response to cytotoxic agents in general, or are truly specific for a particular class of anti-cancer drug. At this juncture, we are in a transition between empirical and molecular medicine. However, if we want 'tailored' breast cancer management to become a reality, we need adequate validation of the predictors in prospective clinical trials, such as the MINDACT trial. Finally, no matter how sophisticated and thorough a microarray analysis may be, there is a stochastic component to patient outcome that will prevent any prognostic model from becoming perfect, because there is a certain unavoidable randomness to fate.