Breast cancer is the most common malignancy in women. It is estimated that 268600 US women were newly diagnosed with invasive breast cancer in 2019, and that 41760 US women died of breast cancer. Because of its incidence and clinical impact, early and accurate tumor detection with imaging is of utmost importance. Ultrasound, mammography, and magnetic resonance imaging (MRI) play a pivotal role in the diagnosis of breast lesions, with different levels of accuracy. In particular, MRI has shown greater sensitivity than mammography (92% vs 75%, respectively) and than ultrasound (90% vs 39% for ultrasound alone and 49% for ultrasound associated with mammography) for the diagnosis of breast cancer. Thanks to its ability to provide both morphologic and hemodynamic features, dynamic contrast-enhanced MRI (DCE-MRI) provides high sensitivity (over 90%) in the detection of breast cancer, although specificity for lesion characterization is still suboptimal (72%)[2,3]. DCE-MRI has shown high diagnostic value in detecting multifocal, multicentric, or contralateral disease not diagnosed on physical examination, mammography, or ultrasound; recognizing ductal carcinoma in situ (DCIS); evaluating treatment response to neoadjuvant chemotherapy; detecting occult primary breast cancer in patients with metastatic axillary nodes (the so-called “CUP syndrome”); and detecting cancer in dense breast tissue.
Recently, interest in the clinical utility of quantitative imaging has been increasing. In this scenario, radiomics is emerging as a promising tool for quantitative tumor evaluation. Radiomics allows the extraction of quantitative data from medical images that can be combined to provide models for clinical decision support.
The purpose of this article is to describe the current applications of radiomics in breast dynamic contrast-enhanced MRI.
CONCEPTS OF RADIOMICS ANALYSIS
Radiomics is a complex process articulated into distinct steps: Acquisition of images, tumor segmentation, feature extraction, exploratory analysis, and model building. The first step of radiomics is the acquisition of high-quality images. Potentially, any radiologic technique may be used for radiomics analysis. In the field of breast imaging, all the techniques (mammography, ultrasound, and MRI) have shown promising results in radiomics studies. In particular, breast MRI is commonly performed using T2-weighted images, acquired to characterize diseased tissue; diffusion-weighted imaging (DWI) with apparent diffusion coefficient (ADC) maps, which have an important clinical role in the evaluation of breast lesions; and post-contrast dynamic imaging, which is mandatory for the differentiation of benign and malignant lesions. The next step is the segmentation of the lesion (Figure 1), with selection of a region of interest (ROI) and delineation of the borders of its volume. The ROI selection process is not yet standardized and varies widely between studies, as it can include the whole tumor or a single slice.
Figure 1 Examples of lesion segmentation in dynamic contrast-enhanced-magnetic resonance imaging in a 70-year-old woman with 3.0 cm breast cancer lesion. A: Short tau inversion recovery; B: Diffusion-weighted imaging; C: Contrast-enhanced sequences.
Feature extraction may be performed with different radiomics software packages that are able to provide a large number of quantitative features. Quantitative radiomics features can be divided into morphological features (basic features that describe the shape of the ROI and its geometric properties, such as volume, diameter, and sphericity) and statistical features (calculated using statistical methods). Statistical features can be further divided into first order (histogram-based) features that describe the distribution of voxel values without considering spatial relationships (e.g. mean, median, skewness, kurtosis, and entropy); second order texture features that are obtained by calculating the relationships between neighboring voxels (e.g. grey level co-occurrence matrix, grey level run length matrix, grey level size zone matrix); and third order features that are obtained by statistical methods after applying filters or mathematical transforms to the images (e.g. wavelet transform, Laplacian transforms of Gaussian-filtered images).
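To make the distinction between first order and second order features concrete, the following is a minimal sketch in plain Python. It is not the implementation of any specific radiomics software: the function names `first_order_features` and `glcm_entropy` are hypothetical, the intensities are assumed to be already discretized into integer grey levels, and the co-occurrence matrix is built for a single pixel offset on a 2D slice without symmetrization.

```python
import math
from collections import Counter


def first_order_features(values):
    """Histogram-based statistics of the grey levels inside an ROI.

    Spatial position is ignored: only the distribution of values matters.
    """
    n = len(values)
    mean = sum(values) / n
    ordered = sorted(values)
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance)
    # Standardized third and fourth central moments (population definitions).
    skewness = sum((v - mean) ** 3 for v in values) / (n * std ** 3) if std else 0.0
    kurtosis = sum((v - mean) ** 4 for v in values) / (n * std ** 4) if std else 0.0
    # Shannon entropy (in bits) of the grey level histogram.
    counts = Counter(values)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {"mean": mean, "median": median, "skewness": skewness,
            "kurtosis": kurtosis, "entropy": entropy}


def glcm_entropy(image, dx=1, dy=0):
    """Entropy of a grey level co-occurrence matrix for one offset (dx, dy).

    Each entry counts how often grey level a has grey level b as its
    neighbor at the given offset, so spatial relationships are captured.
    """
    pairs = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                pairs[(image[r][c], image[r2][c2])] += 1
    total = sum(pairs.values())
    return -sum((k / total) * math.log2(k / total) for k in pairs.values())


# Hypothetical 4 x 4 ROI with three grey levels.
roi = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 2, 2],
    [3, 3, 1, 1],
]
flat = [v for row in roi for v in row]
print(first_order_features(flat))
print(glcm_entropy(roi))
```

Two ROIs with identical histograms have identical first order features but can differ in co-occurrence entropy, which is why texture features are described as capturing heterogeneity.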
The last step in the workflow is building the statistical radiomics model, with the purpose of predicting an outcome or response variable. Different models, built with a variety of classifiers, can be evaluated to predict a specific outcome or response.
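The model-building step can be illustrated with a deliberately simple sketch, assuming each lesion is already reduced to a short feature vector. The classifier (nearest centroid), the leave-one-out evaluation, and all feature values below are hypothetical choices made for illustration; published studies typically use richer classifiers and independent validation sets.

```python
import math


def nearest_centroid_predict(train_x, train_y, sample):
    """Assign the sample to the class whose mean feature vector is closest."""
    groups = {}
    for x, y in zip(train_x, train_y):
        groups.setdefault(y, []).append(x)
    best_label, best_dist = None, math.inf
    for label, rows in groups.items():
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = math.dist(centroid, sample)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(features, labels):
    """Hold out each lesion in turn, train on the rest, and score the prediction."""
    hits = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        predicted = nearest_centroid_predict(train_x, train_y, features[i])
        hits += predicted == labels[i]
    return hits / len(features)


# Hypothetical feature vectors: [texture entropy, compactness].
features = [
    [2.1, 0.80], [2.4, 0.75], [2.0, 0.85],   # labeled benign
    [3.6, 0.40], [3.9, 0.35], [3.3, 0.45],   # labeled malignant
]
labels = ["benign"] * 3 + ["malignant"] * 3
print(leave_one_out_accuracy(features, labels))
```

In practice the features would be standardized first, since Euclidean distance is dominated by whichever feature has the largest numeric range.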
APPLICATION OF RADIOMICS IN BREAST DCE-MRI
The emerging field of radiomics has been applied to several breast imaging modalities[8,9]. DCE-MRI was used in most studies, although with heterogeneity in study design related to magnetic field strength (1.5T or 3T), contrast media used, and software available to perform radiomics. In this review we will focus on the following applications of radiomics in breast DCE-MRI: Characterization of breast lesions, prediction of breast cancer histological types, correlation with receptor status, prediction of lymph node metastases, prediction of tumor response to neoadjuvant systemic therapy (NST), prognosis and recurrence risks.
Characterization of breast lesions
Radiomics features extracted from multiple MRI sequences have been shown to be helpful in establishing predictive models that could help differentiate between benign and malignant breast lesions. Several radiomics models have been proposed with promising results, with most texture analyses performed on post-contrast T1-weighted images, alone or in association with other sequences (T2w and ADC maps).
Among the earliest studies in the literature, conducted on small populations and analyzing different types of features (dynamic, textural, spatio-temporal) extracted from breast contrast-enhanced MRI, Fusco et al found that the dynamic subset gave the best performance for the characterization of breast lesions. Testing a multi-layer perceptron neural network classifier, with automatic ROI segmentation or ROI classification, they found an accuracy of about 80% for the dynamic feature subset, with the greatest power in differentiating benign from malignant lesions found for the “basal signal”, “sum of intensities difference”, “relative enhancement slope” and “relative enhancement” features.
Nie et al investigated the utility of breast lesion morphology and textural features for differentiating between benign and malignant lesions, with both manual and automated segmentation, and performed diagnostic feature selection using an artificial neural network. They found that, among morphological features, “Compactness” and “Normalized Radial Length Entropy” showed significant differences between the benign and the malignant groups, whereas among “Gray Level Co-occurrence Matrices” texture features, “Gray Level Entropy” and “Gray Level Sum Average” were significantly lower in benign compared to malignant lesions. Analyzing the diagnostic performance of individual and combined features, the highest area under the receiver operating characteristic curve (AUROC, 0.86) was obtained by combining the following 6 features: Compactness, NRL entropy, volume, gray level entropy, gray level sum average, and homogeneity. Entropy is an important feature associated with tumor aggressiveness. It represents one of the most reliable features to distinguish malignant from benign lesions, with the irregularity of texture reflecting tumor heterogeneity and aggressiveness[13-15]. Gibbs et al, testing texture analysis with the aim of characterizing breast lesions, concluded that the texture features of variance, sum entropy, and entropy were the most significant when discriminating between benign and malignant lesions.
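Since most of the results in this review are reported as AUROC, a short sketch of how that quantity can be computed may help. It uses the rank interpretation of the AUROC: the probability that a randomly chosen malignant lesion receives a higher model score than a randomly chosen benign one, with ties counted as one half. The function name `auroc` and the scores are hypothetical and do not come from any cited study.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for malignant (positive), 0 for benign (negative).
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier outputs for four lesions.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
print(auroc(scores, labels))  # 3 of the 4 pairs are ranked correctly: 0.75
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading values such as 0.86.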
Radiomics models based on quantitative pharmacokinetic maps demonstrated a strong ability to discriminate between benign and malignant breast lesions, as these maps directly reflect the physiological properties of tissues, such as vessel permeability, perfusion, and volume of the extravascular/extracellular space[13,17]. Nagarajan et al studied texture features extracted from the lesion enhancement pattern on all five post-contrast images, thus using a dynamic texture quantification approach. In this study, the highest AUROC (0.82) was achieved with texture features responsible for capturing aspects of lesion heterogeneity. Gibbs et al also assessed the efficacy of radiomics analysis with quantitative pharmacokinetic maps in small breast lesions (less than 1 cm). Their results showed that texture parameters calculated from initial enhancement, overall enhancement, and area under the enhancement curve maps offered similar power in discriminating benign and malignant breast lesions, whereas texture features obtained from washout maps did not demonstrate any diagnostic value.
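The enhancement quantities named above (initial enhancement, overall enhancement, area under the enhancement curve, washout) can be sketched for a single voxel as follows. The exact definitions vary between studies and are not specified here, so the formulas in this block are assumptions based on common conventions, and the function name `enhancement_features` and the signal values are hypothetical.

```python
def enhancement_features(signal, times):
    """Semi-quantitative kinetic descriptors for one voxel of a DCE series.

    signal[0] is the pre-contrast intensity; later entries are post-contrast.
    times are the acquisition times in minutes, in the same order.
    All definitions below are assumed conventions, not a standard.
    """
    baseline = signal[0]
    # Relative enhancement at each time point, as a fraction of baseline.
    rel = [(s - baseline) / baseline for s in signal]
    initial = rel[1]             # first post-contrast phase
    overall = rel[-1]            # last post-contrast phase
    washout = max(rel) - rel[-1]  # drop from peak to the final phase
    # Trapezoidal area under the relative enhancement curve.
    area = sum(
        (times[i + 1] - times[i]) * (rel[i] + rel[i + 1]) / 2
        for i in range(len(rel) - 1)
    )
    return {"initial": initial, "overall": overall,
            "washout": washout, "area": area}


# Hypothetical voxel: fast uptake followed by partial washout.
print(enhancement_features([100, 200, 220, 180], [0, 1, 2, 3]))
```

Evaluating such a function at every voxel of the ROI yields one parametric map per descriptor, and texture features are then computed on those maps rather than on the raw images.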
While many studies focused on the discriminatory capacity of specific texture features extracted by combining quantitative pharmacokinetic parameters of DCE-MRI sequences, few studies used a multiparametric approach that also analyzed features extracted from other sequences, such as T2-weighted and T1-weighted imaging, diffusion kurtosis imaging, and ADC maps. The multimodal MRI-based radiomics model developed by Zhang et al demonstrated higher diagnostic ability for differentiating benign and malignant breast lesions [Area under curve (AUC) = 0.921], improving on the discriminatory power of radiomics features extracted from DCE pharmacokinetic parameter maps alone (AUC = 0.836). In particular, among the textural features included in the radiomics models, malignant breast lesions had higher entropy and nonuniformity than benign lesions. The multiview IsoSVM (hybrid isomap and support vector machine) model applied by Parekh et al to radiomics features extracted from multiparametric breast MR imaging at 3T classified benign and malignant breast tumors with an AUROC of 0.91, sensitivity of 93%, and specificity of 85%. In this study, the entropy feature maps demonstrated significantly higher entropy for malignant than benign lesions on post-contrast DCE-MRI and ADC maps. The same authors developed a multiparametric imaging radiomics framework for the extraction of first and second order radiomics features from multiparametric radiological datasets, which provided a 9%-28% increase in AUROC over single radiomics parameters. Similar results were reported by Bhooshan et al, who found the best performance when applying a multiparametric feature vector, with T2-weighted MRI textural features added to DCE-MRI kinetic ones.
Radiomics features extracted from unenhanced MRI sequences were also evaluated for the prediction of malignancies. In the study of Bickelhaupt et al, an unenhanced, abbreviated DWI protocol (ueMRI), including T2-weighted, DWI, DWI with background suppression sequences, and corresponding ADC maps, was used to test three machine learning classifiers: A univariate mean ADC model, an unconstrained radiomics model, and a constrained radiomics model with mandatory inclusion of mean ADC. The last two radiomics classifiers were found to distinguish benign from malignant lesions more accurately (AUROC of 0.842 and 0.851) than the mean ADC parameter alone (AUROC of 0.774). Nevertheless, their performance remained lower than that of an experienced breast radiologist using a standard DCE-MRI protocol. ADC radiomics features reflect the heterogeneity of diffusion in tumors, which is related to cell density and to the distribution of the microenvironment inside the lesion. Hu et al found that an ADC radiomics score was more accurate than ADC values alone, and they developed a prediction model based on ADC radiomics, pharmacokinetics, and clinical features, which showed good diagnostic performance in differentiating benign and malignant lesions classified as BI-RADS 4. A radiomics model based on kurtosis diffusion-weighted imaging was evaluated by Bickelhaupt et al, who conducted a multicentric, prospective study on BI-RADS 4 and 5 lesions using MRI scanners from different vendors, showing reliable results, with a real benefit for BI-RADS 4a and 4b breast lesions.
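For readers less familiar with how the ADC values discussed above arise, the following sketch assumes the standard mono-exponential diffusion model, S(b) = S0 · exp(−b · ADC), and estimates the ADC of one voxel from two b-values. The function name `adc_two_point` and the signal values are hypothetical; clinical ADC maps are often fitted from more than two b-values.

```python
import math


def adc_two_point(signal_low, signal_high, b_low, b_high):
    """Estimate ADC (mm^2/s) from signals at two b-values (s/mm^2).

    Assumes mono-exponential decay: S(b) = S0 * exp(-b * ADC),
    so ADC = ln(S_low / S_high) / (b_high - b_low).
    """
    return math.log(signal_low / signal_high) / (b_high - b_low)


# Hypothetical voxel whose true ADC is 1.0e-3 mm^2/s.
s_b0 = 1000.0
s_b800 = s_b0 * math.exp(-800 * 1.0e-3)
print(adc_two_point(s_b0, s_b800, 0, 800))
```

Applying this voxel by voxel produces the ADC map; the mean ADC model mentioned above uses only the average of that map over the lesion, whereas ADC radiomics also describes how the values are distributed.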
Finally, more recent DCE-MRI studies have focused on the inclusion of peritumoral tissue during segmentation. Zhou et al found that the smallest bounding box, which included a small amount of peritumoral tissue adjacent to the tumor, had higher accuracy compared to tumor alone or larger input boxes.
Prediction of breast cancer histological types
Few studies have employed radiomics models and texture analysis to distinguish between the heterogeneous histopathologic subtypes of breast cancer; entropy-based features from the co-occurrence matrix appear to be the most important, with promising results. Invasive ductal (IDC) and lobular (ILC) carcinoma are the most common pathologic types. Their different growth patterns may manifest with different heterogeneity of internal enhancement in DCE-MRI, and could be the basis for differentiating between these two histological types by means of textural analysis[14,26]. Holli et al found that the co-occurrence matrix texture feature group was significantly different between ductal and lobular invasive cancers on DCE-MR images. Similar conclusions were reported by Waugh et al, who analyzed differences between IDC, ILC, and ductal carcinoma in situ (DCIS). Chou et al investigated the potential role of radiomics in classifying DCIS nuclear grade and found that only one heterogeneity metric, surface-to-volume ratio from the “shape and morphology” metrics group, was significantly different between “high nuclear grade” and “non-high nuclear grade” DCIS.
Correlation with receptor status and molecular subtypes
The expression levels of Ki-67, estrogen receptor (ER), progesterone receptor, and human epidermal growth factor receptor 2 (HER2) are crucial factors to differentiate breast cancers into four main molecular subtypes (Luminal A, Luminal B, HER2 over-expressing, and triple negative, TN) with different outcomes and therapeutic strategies. According to the molecular subtype, different strategies, including surgery and adjuvant or neoadjuvant therapies, can be undertaken[28-31]. Current assessment of molecular subtypes is mostly based on immunohistochemistry (IHC). When IHC is performed on tissue specimens obtained by needle biopsy, the sample may not be fully representative of the entire tumor, or results may be inconclusive due to insufficient material. In this setting, according to prior studies, DCE-MRI may provide information suggesting the molecular subtype of breast cancer. In 2018, the American Joint Committee on Cancer updated the breast cancer staging guidelines to add other cancer characteristics, including receptor status, to the TNM system to determine a cancer’s stage. When developing a treatment plan, a correct assessment of receptor status is crucial. Several published studies revealed that rim enhancement, heterogeneous internal enhancement, and peritumoral edema are more frequently associated with TN than with Luminal subtypes[34,35]. In the study of Blaschke et al, HER2-enriched tumors showed a higher percent volume with > 50% and > 100% early phase uptake than Luminal A/B lesions at kinetic assessment. Compared with other subtypes, TN cancers tend to be more frequently round in shape[32,37], and HER2 cancers tend to have smooth margins. Conflicting results were reported for diffusion-weighted imaging, suggesting that high ADC values are associated with HER2 subtypes or with Luminal A, and for spectroscopy, with some authors suggesting that high values of tCho are statistically correlated with the TN subtype[39,40], and others with non-TN and Luminal B.
Several studies investigated the relationship between radiomics MRI features and breast cancer receptor status[42-44]. Wu et al reported only a few features significantly associated with Luminal A, Luminal B, or TN in their study cohorts for distinguishing different molecular subtypes of breast cancer. Radiomics analysis conducted by Li et al showed a statistically significant trend for the relationship between enhancement textures (entropy) and molecular subtypes in the task of distinguishing between ER+ versus ER−. Indeed, the heterogeneous nature of contrast uptake within the breast tumor is related to molecular subtype. Similar observations were reported by Waugh et al, revealing that HER2-enriched and TN cancers showed a significant increase in entropy value. In the study of Chang et al, the quantitative region-based features extracted from breast DCE-MRI were used to interpret intra-tumoral heterogeneity and were correlated with ER, HER2, and TNBC, with better performance than morphological features (texture features and shape feature) and the pharmacokinetic model. Fan et al investigated the use of features extracted from DCE-MRI for the prediction of the molecular subtypes of breast cancer and observed low kurtosis and skewness for the luminal A subtype, and the highest enhancement values in the normal breasts for HER2 subtypes and the lowest for luminal A and luminal B tumors. Furthermore, other studies suggested the value of the heterogeneity of the surrounding parenchyma, including background parenchymal enhancement features, in differentiating TN breast cancers from others, as observed by Wang et al. The evaluation of both peritumoral and intratumoral features allowed identification of the HER2 subtype with better accuracy than intratumoral features alone in the study of Braman et al. According to the results of Leithner et al, radiomics analysis from DWI with ADC mapping allows evaluation of breast cancer receptor status and molecular subtyping.
For differentiating ER positive breast cancer molecular subtypes (Luminal A vs Luminal B) the two most discriminative texture parameters extracted from the dynamic T1-weighted sequences by Holli-Helenius et al were sum entropy and sum variance, which also showed positive correlation with higher Ki-67 index.
High Ki-67 expression is a well-known prognostic factor, related to better neoadjuvant therapy response but poorer prognosis. Assessment of Ki-67 by immunohistochemistry on tissue specimens obtained by needle biopsy may not be representative of the whole tumor because of the relatively small tissue sample size and tumor heterogeneity. In an attempt to predict the expression of Ki-67, several studies have explored the potential of radiomics imaging features, with promising results. In their retrospective study, Ma et al showed that texture features extracted from the first post-contrast images were associated with breast cancer Ki-67 expression. Similar results were obtained by Juan et al. A correlation between Ki-67 expression and radiomics features was also observed when features were extracted from T2-weighted images and ADC maps.
Prediction of lymph node status
Involvement of axillary lymph nodes (LN) in patients with breast cancer represents a crucial prognostic factor, as it guides therapeutic management. Non-invasive methods to preoperatively evaluate LN metastasis are highly needed. Some promising studies suggested that radiomics models could achieve this objective. In recent studies, specific lesion textural features extracted from anatomical and functional MRI images improved the performance of radiomics models in predicting LN metastasis[57,58]. Liu et al demonstrated that DCE-MRI radiomics features, particularly features extracted from peritumoral regions, combined with clinico-pathologic information, were able to predict LN metastasis in breast cancer patients. Indeed, the area surrounding tumors is thought to carry information on features such as peritumoral lymphatic vessel invasion, lymphocytic infiltration, and edema[59,60]. Other authors reported that the best results were obtained when feature extraction was performed in the strongest phases of tumor enhancement, probably because these phases show the lesion boundaries more clearly and better reflect tumor heterogeneity and invasiveness. The radiomics nomogram developed by Han et al demonstrated excellent performance in predicting LN metastases, and good ability in distinguishing the number of metastatic LNs. Similar performances were reported in several other studies[59,63-65]. Finally, only very few studies evaluated texture analysis of identified index lymph nodes on postcontrast T1-weighted images, concluding that morphologic features were more predictive than kinetic and texture features[66,67].
Prediction of tumor response to neo-adjuvant therapy
NST is often the first-line treatment for patients diagnosed with locally advanced breast cancer, with several potential advantages, including the reduction of tumor size to allow breast-conserving surgery instead of mastectomy, as well as serving as a prognostic indicator. The pathologic complete response (pCR) rate ranges from 0.3% to 38.7%, depending on cancer subtype and breast cancer stage. Early identification of patients who are not likely to achieve pCR is crucial, as they could benefit from changes to their initial NST regimens. DCE-MRI is considered the most reliable technique for evaluating the response to NST. According to a meta-analysis based on 25 studies, breast MRI had high specificity (up to 90.7%) but low sensitivity (63.1%) in predicting pathologic complete remission after preoperative therapy in patients with breast cancer. According to another recent meta-analysis, the accuracy of breast MRI in the detection of residual malignancy also varies with the treatment type, with AUC values ranging from 0.83 to 0.89, and with the definition of response, for instance volume reduction, absence of enhancement, or enhancement equal to or less than that of breast parenchyma[71,72]. The wide heterogeneity of studies, with conflicting results, suggests the need to standardize definitions and primary endpoints in order to produce clinically significant results.
The identification of pCR is still a challenge and, according to several studies, radiomics can be helpful in the non-invasive prediction of response to NST[74-78]. In most studies, GLCM features were the most predictive of therapy response, particularly entropy[79-81]. Notably, in the study of Parikh et al, responders to NST showed an increase in lesion homogeneity after one round of therapy. Cao et al demonstrated that texture analysis may help to improve the performance of post-NST MRI in identifying pCR in mass-like breast cancer, showing that entropy was an independent risk factor. Intratumoral spatial heterogeneity at perfusion MRI appeared to be an independent prognostic factor of recurrence-free survival in patients with locally advanced breast cancers treated with NST. Significant differences between pCR and non-pCR patients were also found for texture parameters by Fusco et al. The peritumoral region carries prognostic information, such as angiogenic and lymphangiogenic activity, peritumoral invasion of lymphatics and blood vessels, and peritumoral lymphocytic infiltration. In their retrospective study, Braman et al demonstrated that with a combined intratumoral and peritumoral radiomics approach, analyzing textural features extracted from T1-weighted contrast-enhanced MRI scans, it is possible to successfully predict pCR to NST from pretreatment breast DCE-MRI, both with and without a priori knowledge of receptor status. Later, the same authors confirmed that an intratumoral and peritumoral imaging signature was capable of predicting the response to preoperative targeted therapy in another retrospective study conducted on HER2-positive breast cancers, highlighting again the relationship between immune response and the peritumoral environment.
Zhou et al investigated the role of wavelet-transformed textures, which can provide comprehensive spatial and frequency distributions for characterizing intratumoral and peritumoral regions in terms of low and high frequency signals. In their study, wavelet-transformed textures outperformed volumetric and peripheral textures in the radiomics MRI prediction of pCR to NST for patients with locally advanced breast cancers.
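The idea of separating an image into low and high frequency components can be sketched with a single-level 2D Haar decomposition. This is only an illustration of the principle: the wavelet family and implementation used in the cited study are not stated here, the averaging normalization below is one of several conventions, the sub-band naming (LH vs HL) differs between libraries, and the image is assumed to have even dimensions.

```python
def haar_1d(values):
    """One Haar step: pairwise averages (low-pass) and half-differences (high-pass)."""
    half = len(values) // 2
    approx = [(values[2 * i] + values[2 * i + 1]) / 2 for i in range(half)]
    detail = [(values[2 * i] - values[2 * i + 1]) / 2 for i in range(half)]
    return approx, detail


def _filter_columns(block):
    """Apply haar_1d down every column and return the (low, high) blocks."""
    pairs = [haar_1d(list(col)) for col in zip(*block)]
    low = [list(row) for row in zip(*(p[0] for p in pairs))]
    high = [list(row) for row in zip(*(p[1] for p in pairs))]
    return low, high


def haar_2d(image):
    """Single-level 2D Haar transform: filter rows first, then columns."""
    row_low, row_high = [], []
    for row in image:
        approx, detail = haar_1d(row)
        row_low.append(approx)
        row_high.append(detail)
    ll, lh = _filter_columns(row_low)    # smooth content, row-to-row changes
    hl, hh = _filter_columns(row_high)   # column-to-column changes, diagonal detail
    return {"LL": ll, "LH": lh, "HL": hl, "HH": hh}


def subband_energy(block):
    """Mean squared coefficient: a simple texture descriptor of one sub-band."""
    values = [v for row in block for v in row]
    return sum(v * v for v in values) / len(values)


# Hypothetical 4 x 4 patch: smooth on the left, alternating on the right.
patch = [
    [10, 10, 10, 30],
    [10, 10, 30, 10],
    [10, 10, 10, 30],
    [10, 10, 30, 10],
]
bands = haar_2d(patch)
for name, block in bands.items():
    print(name, subband_energy(block))
```

First order or texture features computed on each sub-band are the kind of third order features described earlier: statistics obtained after a mathematical transform of the image.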
DWI is considerably sensitive to NST-induced intratumoral changes, providing additional value when combined with contrast-enhanced MRI in radiomics models. Radiomics signatures combining multi-parametric MRI achieved a good performance in predicting complete response in BC, in both Luminal and TN cancers, in the study conducted by Liu et al. With a radiomics signature combining radiomics features from DCE-MRI and ADC maps, Chen et al obtained similar results in predicting pCR, with a higher performance than the models based on DCE-MRI or ADC maps alone.
Sentinel lymph node biopsy has replaced axillary lymph node dissection in patients who convert to node-negative status after NST. Several studies assessed whether breast MRI can be used to evaluate residual lymph node metastasis after NST, allowing breast cancer patients to avoid unnecessary axillary surgery. In the study of Hyun et al, DCE-MRI was able to rule out the presence of advanced nodal disease with a negative predictive value (NPV) of 94% in patients treated with neoadjuvant chemotherapy (NAC). Nevertheless, in the work of Mattingly et al, post-treatment MRI and surgical pathologic findings showed only slight agreement, and DCE-MRI had specificity and sensitivity of 63% and 55%, respectively. Ha et al found different results, with sensitivity and specificity of 57% and 72%, and with positive estrogen receptor status significantly associated with misdiagnosis by MRI. These latter findings, revealing that post-treatment MRI was not reliably predictive of residual axillary disease, suggest that DCE-MRI results should be used with caution when planning treatment and that sentinel lymph node biopsy or axillary lymph node dissection for staging should not be omitted in women determined to be node-positive before treatment. In this setting, convolutional neural networks (CNN) were employed in a few studies to predict the likelihood of axillary LN metastasis and NAC treatment response, using MRI datasets acquired prior to initiation of NAC, with conflicting results[79,94-96]. Ha et al reported an accuracy of 83% with an AUC of 0.93 for a CNN in predicting axillary response. Nevertheless, in the study of Golden et al, the GLCM texture features extracted from pre-chemotherapy MRI were able to predict pCR and residual lymph node metastasis with an AUC of only 0.68.
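The sensitivity, specificity, and NPV figures quoted above all derive from the same 2 × 2 table comparing the imaging result with the pathologic reference. The following minimal sketch shows the relationships; the function name `diagnostic_metrics` and the counts are hypothetical and are not taken from any of the cited studies.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard accuracy measures from a 2 x 2 confusion table.

    tp: imaging positive, pathology positive
    fp: imaging positive, pathology negative
    tn: imaging negative, pathology negative
    fn: imaging negative, pathology positive
    """
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients correctly detected
        "specificity": tn / (tn + fp),   # disease-free patients correctly cleared
        "ppv": tp / (tp + fp),           # positive findings that are true disease
        "npv": tn / (tn + fn),           # negative findings that are truly negative
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical cohort of 100 patients, 50 with residual nodal disease.
for name, value in diagnostic_metrics(tp=40, fp=5, tn=45, fn=10).items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, the predictive values depend on disease prevalence in the cohort, so an NPV reported in one population does not transfer directly to another.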
Radiomics models have demonstrated promising results in predicting the prognosis of patients with tumors of various organs, with several texture features, such as uniformity and entropy, reported to be useful in risk stratification[15,97,98]. Using the genomic-based scores of the multigene assays MammaPrint, Oncotype DX, and PAM50 as reference standards, Li et al demonstrated that breast MRI radiomics shows a promising role for image-based phenotyping in assessing the risk of recurrence. Notably, enhancement texture features were consistently associated with recurrence score, highlighting how microvascular density and/or central necrosis, which are responsible for tumor heterogeneity, play an important biological role in recurrence. Other authors confirmed these results, finding that tumors with higher entropy on T2-weighted images and lower entropy on T1w subtraction images were associated with poorer recurrence-free survival. A CNN developed by Ha et al was able to predict, with an accuracy of up to 84%, the Oncotype Dx Recurrence Score (ODRS), an expensive but validated recurrence score recommended by American Society of Clinical Oncology guidelines to decide on adjuvant systemic chemotherapy in ER+/HER2−/node-negative lesions. Nevertheless, this result was not confirmed by Saha et al, who tested two machine learning-based models and found only a moderate association between imaging and ODRS. The study of Park et al was the first, using ROIs drawn on entire tumors, to demonstrate that a radiomics signature can estimate survival in patients with BC. They generated a multivariate feature vector based on morphologic, histogram texture, and GLCM texture features to stratify patients at risk for recurrence. They also showed that a combined radiomics-clinical-pathological nomogram achieved better prognostic performance than either the Rad-score-only or the clinico-pathological nomograms.
Nevertheless, conflicting results were recently reported when applying radiomics models to predict prognosis for TN (triple-negative) breast cancers[104,105]. In the study conducted by Kim et al, the radiomics score was significantly associated with worse disease-free survival but was comparable in performance with the clinico-pathologic model in both the training and validation sets, whereas the work performed by Koh et al showed that their Radiomics model was able to predict systemic recurrence better than the Clinical model only in the training set.