Wei X, Yan XJ, Guo YY, Zhang J, Wang GR, Fayyaz A, Yu J. Machine learning-based gray-level co-occurrence matrix signature for predicting lymph node metastasis in undifferentiated-type early gastric cancer. World J Gastroenterol 2022; 28(36): 5338-5350 [PMID: 36185632 DOI: 10.3748/wjg.v28.i36.5338]
Corresponding Author of This Article
Jiao Yu, MD, Radiologist, Department of Radiotherapy, Shaanxi Provincial People’s Hospital, No. 256 Youyi West Road, Beilin District, Xi’an 710068, Shaanxi Province, China. firstname.lastname@example.org
This article is an open-access article which was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Author contributions: Yu J and Wei X conceived and designed the study and wrote the manuscript; Yan XJ, Guo YY, Zhang J, Wang GR, and Arsalan F collected the data, performed the data analysis, and interpreted the outcomes; and all authors critically reviewed the content of the manuscript and helped with the drafts.
Supported by the General Project-Social Development Field of Shaanxi Province Science and Technology Department, No. 2021SF-313; and Innovation Capability Support Plan of Shaanxi Science and Technology Department - Science and Technology Innovation Team, No. 2020TD-048.
Institutional review board statement: This study was approved by the Institutional Review Committee of Shaanxi Provincial People’s Hospital (2021-Y024).
Informed consent statement: Written informed consent was not required given the retrospective nature of the study from chart review.
Conflict-of-interest statement: All the authors report no relevant conflicts of interest for this article.
Data sharing statement: No additional data are available.
STROBE statement: The authors have read the STROBE Statement checklist of items, and the manuscript was prepared and revised according to the STROBE Statement checklist of items.
Received: July 20, 2022 Peer-review started: July 20, 2022 First decision: August 6, 2022 Revised: August 14, 2022 Accepted: September 6, 2022 Article in press: September 6, 2022 Published online: September 28, 2022
The most important consideration in determining treatment strategies for undifferentiated early gastric cancer (UEGC) is the risk of lymph node metastasis (LNM). Therefore, identifying a potential biomarker that predicts LNM is quite useful in determining treatment.
To develop a machine learning (ML)-based integral procedure to construct the LNM gray-level co-occurrence matrix (GLCM) prediction model.
We retrospectively selected 526 cases of UEGC confirmed through pathological examination after radical gastrectomy without endoscopic treatment in four tertiary hospitals between January 2015 and December 2021. We extracted GLCM-based features from grayscale images and applied ML to the classification of candidate predictive variables. The robustness and clinical utility of each model were evaluated using the receiver operating characteristic (ROC) curve, decision curve analysis, and clinical impact curve.
GLCM-based feature extraction significantly correlated with LNM. The top 7 GLCM-based factors included inertia value 0° (IV_0), inertia value 45° (IV_45), inverse gap 0° (IG_0), inverse gap 45° (IG_45), inverse gap full angle (IG_all), Haralick 30° (Haralick_30), Haralick full angle (Haralick_all), and Entropy. The areas under the ROC curve (AUCs) of the random forest classifier (RFC) model, support vector machine, eXtreme gradient boosting, artificial neural network, and decision tree ranged from 0.805 [95% confidence interval (CI): 0.258-1.352] to 0.925 (95%CI: 0.378-1.472) in the training set and from 0.794 (95%CI: 0.237-1.351) to 0.912 (95%CI: 0.355-1.469) in the testing set, respectively. The RFC (training set: AUC: 0.925, 95%CI: 0.378-1.472; testing set: AUC: 0.912, 95%CI: 0.355-1.469) model that incorporates Entropy, Haralick_all, Haralick_30, IG_all, IG_45, IG_0, and IV_45 had the highest predictive accuracy.
The evaluation results indicate that selecting radiological and textural features is effective for discriminating LNM in UEGC patients. Additionally, the ML-based prediction model developed using the RFC can be used to identify LNM and guide treatment options, which may improve clinical outcomes.
Core Tip: Gray-level co-occurrence matrix-based feature extraction can be a robust and promising tool to improve the efficiency of predicting lymph node metastasis in individual undifferentiated early gastric cancer patients. Additionally, machine learning adopts more optimized algorithms and clearer feature extraction. Models developed using the random forest classifier had the highest predictive accuracy when incorporating Entropy, Haralick full angle, Haralick 30°, inverse gap full angle, inverse gap 45°, inverse gap 0°, and inertia value 45°. Further research is required to develop these models for clinical practice.
Gastric cancer (GC) is one of the most common and fatal malignancies worldwide and is an important part of the global cancer burden[1,2]. In GC, undifferentiated early GC (UEGC) differs from differentiated-type GC in terms of clinical features and disease state, and their treatment and prognosis vary. Therefore, UEGC should be identified and diagnosed early.
The incidence of lymphatic vessel invasion and the risk of lymph node metastasis (LNM) are high in surgical specimens of UEGC[4,5]. Endoscopic resection (ER), including endoscopic mucosal resection (EMR) and endoscopic submucosal dissection (ESD), is considered a minimally invasive treatment option for early GC with negligible risk of LNM[6,7]. Nevertheless, the indications for and curability of ESD in undifferentiated GC (e.g., poorly differentiated adenocarcinoma, signet ring cell carcinoma, or mucinous adenocarcinoma) have not been evaluated because of the potential risk of LNM. Although ER can be used as a painless treatment, the incidence of LNM after non-curative ER ranges from 5.1% to 12.2%[8-10]. Additionally, ESD is only applicable to intramucosal cancer with a tumor diameter of ≤ 20 mm and without ulcer lesions; thus, treating lesions that meet the ESD indications through surgery is unnecessary[11,12]. Conversely, when a resection beyond the expanded criteria is considered non-curative, the potential risk of LNM cannot be ignored, and additional surgical resection and lymph node dissection should be performed. Unlike in differentiated early GC, ER indications in UEGC are limited. Therefore, a precise tool that can predict LNM is needed to address this challenging problem.
Previous studies have mainly focused on risk factors for LNM or distant metastasis of differentiated-type early GC[13-15]. However, for UEGC, LNM has different risk factors. Thus, objective and universal evaluation indicators for evaluating its risk are lacking. In this study, we clarified the LNM risk factors of patients with UEGC who underwent surgical resection. Subsequently, we analyzed clinical-pathological factors by introducing gray-level co-occurrence matrix (GLCM) image feature extraction mining to classify LNM risk groups according to the combination of risk factors. This study aims to provide a reference for clinical diagnosis and treatment.
MATERIALS AND METHODS
We retrospectively reviewed the clinical records of 526 patients with UEGC confirmed through pathological examination after radical gastrectomy without endoscopic treatment at four tertiary hospitals between January 2015 and December 2021. These hospitals are Shaanxi Provincial People’s Hospital, Shaanxi Provincial Tumour Hospital, the First Affiliated Hospital of Xi’an Jiaotong University, and the Second Affiliated Hospital of Xi’an Jiaotong University. The inclusion criteria were as follows: (1) Imaging examination was performed; (2) A complete set of medical data was available; (3) The primary lesion was resected via open or laparoscopic surgery and not via EMR or ESD; and (4) The status of infiltrating lymph nodes was assessed through routine hematoxylin-eosin staining. To minimize the confounding effect of unnecessary variables, the exclusion criteria were as follows: (1) Sufficient information could not be extracted, or the clinical data were mismatched; and (2) A complete magnetic resonance imaging (MRI) plain scan was not available, or the MRI image quality was unacceptable. This study complies with the provisions of the Helsinki Declaration (revised in 2013) and was approved by the Institutional Review Committee of Shaanxi Provincial People’s Hospital (2021-Y024). Figure 1 presents the patient screening steps and the modeling process in detail.
Figure 1 Flowchart of patient selection and data processing.
UEGC: Undifferentiated early gastric cancer; EMR: Endoscopic mucosal resection; ESD: Endoscopic submucosal dissection; RFC: Random forest classifier; SVM: Support vector machine; DT: Decision tree; ANN: Artificial neural network; XGboost: Extreme gradient boosting; ROC: Receiver operating characteristic; DCA: Decision curve analysis; CIC: Clinical impact curve; LNM: Lymph node metastasis.
Construction strategy of the GLCM
All texture parameter post-processing was conducted on Omni dynamics software (GE pharmaceuticals, Shanghai). Two radiologists with extensive experience in gastrointestinal diagnosis referred to the MRI images to outline the lesions on the apparent diffusion coefficient (ADC) map. First, they manually outlined the entire cancerous area on each layer of the map, avoiding the gas in the intestine, until the whole tumor volume was delineated. Second, the software automatically generated the texture features. In this study, the following GLCM texture parameters were selected: Total frequency, energy value, entropy, inertia value, correlation coefficient, inverse moment, cluster shadow, and cluster prominence.
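To make the texture parameters concrete, the following is a minimal Python sketch of how a GLCM and four common descriptors can be computed from a quantized region of interest. It is an illustration only and is not the algorithm of the commercial software used in this study. The function names `glcm` and `texture_features`, the toy 4 × 4 image, and the mapping of "inertia value" to contrast and "inverse gap" to the inverse difference moment are assumptions made for this example.

```python
import math

def glcm(image, dx, dy, levels):
    """Symmetric, normalized gray-level co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1  # count the pair (i, j)
                counts[j][i] += 1  # and its mirror, so the matrix is symmetric
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def texture_features(p):
    """Energy, entropy, inertia (contrast), and inverse difference moment of a GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "energy": sum(v * v for _, _, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "inertia": sum((i - j) ** 2 * v for i, j, v in cells),
        "inverse_difference_moment": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }

# Hypothetical 4 x 4 region already quantized to 4 gray levels (0 to 3).
roi = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p0 = glcm(roi, dx=1, dy=0, levels=4)    # 0 degrees: right-hand neighbour
p45 = glcm(roi, dx=1, dy=-1, levels=4)  # 45 degrees: upper-right neighbour
features_0 = texture_features(p0)
features_45 = texture_features(p45)
```

Computing the matrix separately for each offset gives direction-specific values (e.g., 0° and 45°), whereas a "full angle" value would typically pool several directions; how the software pools them is not stated in the source.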
Data extraction and quality assessment
For variables with < 10% missing values, the missing entries were filled with the variable’s mean value. Variables with ≥ 10% missing values were excluded from the variable screening of the final model. This study adopted univariate feature imputation for the missing values that met the imputation requirements. That is, the missing values can be imputed using a provided constant value or using the statistics of the column in which the missing values are located (e.g., mean, median, or most frequent value)[16,17].
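The missing-value rule described above can be sketched as follows. This is a minimal illustration under stated assumptions: the helper `impute_or_drop`, the column names, and the example values are hypothetical, and missing entries are represented by `None`.

```python
from statistics import mean

def impute_or_drop(columns, max_missing=0.10):
    """Mean-fill columns whose missing fraction is below max_missing; drop the rest."""
    kept, dropped = {}, []
    for name, values in columns.items():
        observed = [v for v in values if v is not None]
        missing = len(values) - len(observed)
        if missing / len(values) >= max_missing or not observed:
            dropped.append(name)  # too many gaps: exclude from variable screening
            continue
        fill = mean(observed)     # column mean of the observed entries
        kept[name] = [fill if v is None else v for v in values]
    return kept, dropped

# Hypothetical data: 20 patients, 5% missing in one column and 15% in another.
columns = {
    "entropy": [8.0, 9.0, 10.0, None] + [9.0] * 16,
    "cluster_prominence": [65.0] * 17 + [None] * 3,
}
kept, dropped = impute_or_drop(columns)
```

The median or the most frequent value could be substituted for `mean` without changing the structure of the function.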
Construction and effectiveness evaluation of the LNM model
Five commonly used machine learning (ML) algorithms were included: Random forest classifier (RFC), decision tree (DT), support vector machine (SVM), eXtreme gradient boosting (XGBoost), and artificial neural network (ANN). The RFC is an ensemble method that forms a cumulative effect by combining multiple relatively simple estimators; specifically, a random forest is an ensemble learning tool based on DTs. The SVM is a type of generalized linear classifier that performs binary classification through supervised learning. The ANN is a nonlinear transformation algorithm comprising input, hidden, and output layers. Finally, XGBoost is an additive model in which only the sub-model of the current step is optimized in each iteration. In this study, we refer to the guide proposed by Luo et al for the best use of prediction models in biomedical research, that is, the Delphi method, which is used to generate the list of reported items.
For the screening of candidate variables, we mainly relied on repeated sampling with replacement (bagging), ranked the variables according to their weights, and obtained the final predictors of the prediction model from the top 10 variables. For the effectiveness evaluation of the prediction model, the receiver operating characteristic (ROC) curve was used to evaluate the accuracy of the model. Meanwhile, the decision curve analysis and clinical impact curve (CIC) were used to evaluate the model’s robustness and differentiation, respectively.
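As an illustration of the two ideas in this paragraph, the sketch below computes the area under the ROC curve (AUC) from its rank interpretation and uses resampling with replacement to obtain a percentile interval. It is not the procedure used in this study; the function names, the number of resamples, the seed, and the example scores are assumptions.

```python
import random

def auc(scores, labels):
    """Probability that a random positive case outranks a random negative case."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_interval(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile interval of the AUC from samples drawn with replacement."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # draw n cases with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:                         # both classes must be present
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical predicted probabilities and LNM labels (1 = LNM present).
scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 0, 0]
point_estimate = auc(scores, labels)
lower, upper = bootstrap_auc_interval(scores, labels)
```

Because each resampled AUC is itself a proportion, an interval obtained this way always lies between 0 and 1.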
Continuous and categorical data in this study are expressed as median with interquartile range (25%, 75%) and percentage (%), respectively. For the comparison between groups, continuous variables were analyzed using the independent-samples t-test, or the Mann-Whitney U test when the data did not conform to the normal distribution. Categorical data were analyzed using the chi-square goodness-of-fit test. Bonferroni-corrected P values were used to compare the qualitative data. The prediction model visualization and other data analyses were performed using R software (version 4.0.4, http://www.r-project.org/). For the comparison between groups, a P value < 0.05 was considered statistically significant.
Comparison of baseline data between LNM and non-LNM cohorts
Table 1 summarizes the baseline characteristics of 526 hospitalized patients with UEGC. For internal validation, the patients were randomly divided into two sets using the caret package: Training set (n = 368, 70%) and validation set (n = 158, 30%). LNM was present in 62 (16.85%) and 29 (18.35%) patients in the training and validation cohorts, respectively. The previously reported clinical indicators (e.g., tumor size, infiltration depth, vascular invasion, and vascular tumor thrombus) differed significantly between the LNM and non-LNM groups. We found that the GLCM-based texture features also differed significantly between the two groups.
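The split sizes and LNM rates reported above can be reproduced with simple arithmetic. The sketch below uses a plain random 70/30 split in Python rather than the caret package used in the study; the function name and seed are assumptions, and any stratification applied by the original partitioning is not modeled here.

```python
import random

def split_indices(n, train_fraction=0.70, seed=2022):
    """Shuffle patient indices and cut them into training and validation parts."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = round(n * train_fraction)  # 526 * 0.70 = 368.2, rounded to 368
    return idx[:n_train], idx[n_train:]

train_idx, valid_idx = split_indices(526)

# LNM rates implied by the reported counts (62 of 368 and 29 of 158).
lnm_rate_train = round(100 * 62 / 368, 2)
lnm_rate_valid = round(100 * 29 / 158, 2)
```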
Table 1 Patient baseline population and image index characteristic.
Variable | Training set: Overall (n = 368) | Training set: LNM yes (n = 62) | Training set: LNM no (n = 306) | Validation set: Overall (n = 158) | Validation set: LNM yes (n = 29) | Validation set: LNM no (n = 129)
Age (median, IQR), yr | 51.00 (40.75, 61.00) | 52.50 (40.25, 64.50) | 51.00 (41.00, 60.00) | 47.00 (37.25, 58.75) | 53.00 (39.00, 63.00) | 46.00 (37.00, 57.00)
≤ 2 cm
> 2 cm
TF (median, IQR) | 3.78 (3.56, 3.99) | 4.13 (3.97, 4.27) | 3.70 (3.51, 3.90) | 3.79 (3.52, 4.01) | 4.16 (4.00, 4.31) | 3.70 (3.49, 3.93)
EV (median, IQR) | 0.88 (0.64, 1.12) | 0.60 (0.49, 0.68) | 0.98 (0.72, 1.20) | 0.85 (0.65, 1.09) | 0.64 (0.54, 0.70) | 0.92 (0.72, 1.16)
Entropy (median, IQR) | 8.68 (8.37, 8.98) | 10.51 (10.07, 10.88) | 8.57 (8.33, 8.83) | 8.79 (8.43, 9.02) | 10.44 (10.16, 10.96) | 8.65 (8.38, 8.89)
IG_all (median, IQR) | 2.16 (1.76, 2.47) | 3.04 (2.64, 3.62) | 2.03 (1.69, 2.30) | 2.12 (1.75, 2.47) | 2.94 (2.70, 3.54) | 1.97 (1.64, 2.31)
IG_0 (median, IQR) | 2.26 (1.75, 2.66) | 3.34 (2.72, 3.80) | 2.10 (1.69, 2.48) | 2.41 (1.90, 2.79) | 3.70 (3.16, 4.18) | 2.22 (1.77, 2.62)
IG_45 (median, IQR) | 1.88 (1.54, 2.18) | 2.85 (2.32, 3.26) | 1.78 (1.47, 2.04) | 1.85 (1.48, 2.18) | 2.73 (2.31, 3.11) | 1.74 (1.40, 2.03)
IG_90 (median, IQR) | 2.34 (1.85, 2.85) | 3.36 (2.89, 3.84) | 2.20 (1.75, 2.63) | 2.42 (1.94, 2.78) | 3.27 (3.03, 3.61) | 2.25 (1.75, 2.61)
IV_all (median, IQR) | 176.90 (148.98, 207.25) | 134.80 (109.30, 163.02) | 182.00 (156.00, 210.75) | 175.50 (143.25, 200.75) | 133.50 (105.80, 155.70) | 183.00 (154.00, 206.00)
IV_all_SD (median, IQR) | 4584.00 (3148.00, 6602.50) | 2166.50 (1340.50, 3535.00) | 5025.00 (3747.00, 7011.75) | 4940.50 (2987.25, 6682.00) | 2849.00 (1841.00, 3428.00) | 5618.00 (3813.00, 6897.00)
IV_0 (median, IQR) | 149.85 (122.75, 186.75) | 96.40 (78.95, 125.82) | 163.20 (134.00, 195.65) | 146.70 (112.78, 185.57) | 74.10 (65.60, 90.60) | 158.40 (131.40, 196.20)
IV_45 (median, IQR) | 239.55 (201.40, 284.75) | 164.40 (123.83, 188.62) | 254.30 (220.67, 290.60) | 226.25 (201.25, 266.67) | 157.40 (133.90, 193.80) | 243.90 (214.30, 273.50)
IV_90 (median, IQR) | 129.00 (103.00, 154.00) | 101.00 (77.75, 119.00) | 134.00 (109.25, 159.00) | 124.50 (109.00, 150.75) | 105.00 (77.00, 118.00) | 133.00 (117.00, 156.00)
Haralick_all (median, IQR) | 0.10 (0.09, 0.10) | 0.12 (0.11, 0.13) | 0.09 (0.09, 0.10) | 0.10 (0.09, 0.10) | 0.12 (0.12, 0.14) | 0.09 (0.09, 0.10)
Haralick_30 (median, IQR) | 0.10 (0.09, 0.11) | 0.14 (0.12, 0.15) | 0.10 (0.09, 0.11) | 0.10 (0.09, 0.11) | 0.14 (0.13, 0.15) | 0.10 (0.09, 0.11)
Haralick_45 (median, IQR) | 0.09 (0.08, 0.10) | 0.11 (0.10, 0.12) | 0.09 (0.08, 0.10) | 0.09 (0.08, 0.10) | 0.11 (0.10, 0.13) | 0.09 (0.08, 0.10)
Haralick_90 (median, IQR) | 0.11 (0.10, 0.13) | 0.14 (0.12, 0.16) | 0.11 (0.09, 0.12) | 0.12 (0.10, 0.13) | 0.15 (0.12, 0.16) | 0.11 (0.09, 0.13)
CSV (median, IQR) | 106.00 (102.00, 111.00) | 108.00 (105.00, 111.00) | 106.00 (101.00, 111.00) | 107.00 (102.25, 111.00) | 109.00 (105.00, 113.00) | 107.00 (102.00, 111.00)
CP (median, IQR) | 65.50 (60.00, 70.00) | 68.00 (66.00, 71.00) | 64.00 (59.00, 70.00) | 64.00 (60.00, 68.00) | 67.00 (64.00, 68.00) | 63.00 (59.00, 68.00)
IQR: Interquartile range; TF: Total frequency; EV: Energy value; IV_0: Inertia value 0°; IV_45: Inertia value 45°; IV_90: Inertia value 90°; IG_0: Inverse gap 0°; IG_45: Inverse gap 45°; IG_90: Inverse gap 90°; IG_all: Inverse gap full angle; Haralick_30: Haralick 30°; Haralick_45: Haralick 45°; Haralick_90: Haralick 90°; Haralick_all: Haralick full angle; CSV: Cluster shadow value; CP: Cluster prominence.
Feature correlation and potential predictors
Based on the statistical difference analysis of baseline data, we conducted a correlation analysis on the variables with significant differences. As shown in Figure 2A, the correlation matrix (based on Pearson correlation analysis) indicates that the GLCM characteristic variables were strongly correlated with LNM (r > 0.6). For example, Entropy, Haralick full angle (Haralick_all), Haralick 30° (Haralick_30), Inverse gap full angle (IG_all), Inverse gap 45° (IG_45), and Inverse gap 0° (IG_0) were highly correlated with LNM. This suggests that these candidate variables can be used as LNM predictors and for the construction of subsequent models. Interestingly, in the subsequent models developed based on ML algorithms, we found that Entropy, Haralick_all, Haralick_30, IG_all, IG_45, IG_0, and Inertia value 45° (IV_45) carried high weights as the top 7 GLCM-based factors (Figure 2B). Specifically, Entropy had the largest weight among these factors.
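For reference, the Pearson coefficient between a continuous texture feature and the binary LNM status can be computed as in the following Python sketch. The entropy values and labels are invented for illustration and are not study data; the function name is likewise an assumption.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical entropy values and LNM status (1 = LNM present) for seven patients.
entropy = [8.4, 8.6, 8.7, 8.9, 10.1, 10.5, 10.9]
lnm = [0, 0, 0, 0, 1, 1, 1]
r = pearson_r(entropy, lnm)
strongly_correlated = r > 0.6  # threshold quoted in the text
```

With a binary outcome, this quantity is the point-biserial correlation, which is a special case of the Pearson coefficient.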
Figure 2 Variable screening and weight allocation.
A: Correlation matrix analysis of candidate features; B: Weight distribution of candidate variables for each ML-based model. RFC: Random forest classifier; SVM: Support vector machine; DT: Decision tree; ANN: Artificial neural network; XGboost: Extreme gradient boosting.
Establishment and performance evaluation of the LNM prediction model
When constructing the RFC model [training set: Area under the ROC curve (AUC): 0.925, 95% confidence interval (CI): 0.378-1.472; testing set: AUC: 0.912, 95%CI: 0.355-1.469], we repeatedly drew N samples at random, with replacement, from the original training set of size N to generate a new training set for each DT, and then generated M DTs in this way to form a random forest. As shown in Figure 3A and Supplementary Table 1, the variables yielding the smallest Gini index after splitting were selected, including Entropy, Haralick_all, Haralick_30, IG_all, IG_45, IG_0, and IV_45. Similarly, Haralick_30 and IG_all carried important weights at the DT branches (training set: AUC: 0.856, 95%CI: 0.309-1.403; testing set: AUC: 0.813, 95%CI: 0.256-1.370) (Figure 3B). In the ANN model (Figure 4), the accuracy of the prediction model developed using the GLCM prediction variables reached 0.887 (95%CI: 0.340-1.434) and 0.837 (95%CI: 0.280-1.394) in the training and testing sets, respectively. Although this accuracy is slightly inferior to that of the RFC model, it is better than those of the other prediction models (i.e., DT, XGBoost, and SVM). Table 2, Supplementary Table 1, and Figure 5 summarize the predictive performance of the ML-based models. In general, the prediction model constructed using any ML algorithm was better than the logistic regression algorithm in predicting LNM, further confirming the superiority of ML algorithms, especially the robustness of the RFC.
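The Gini criterion mentioned above can be illustrated with a short Python sketch that searches one feature for the threshold giving the smallest weighted Gini index. It is a simplified, single-feature illustration and not the implementation used in the study; the function names and the example values are hypothetical.

```python
def gini(labels):
    """Gini impurity of a set of binary labels (0 = non-LNM, 1 = LNM)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)  # equals 1 - p**2 - (1 - p)**2 for two classes

def best_split(values, labels):
    """Threshold on one feature that minimizes the weighted Gini index of the children."""
    best_threshold, best_score = None, float("inf")
    n = len(values)
    for t in sorted(set(values))[:-1]:  # the largest value cannot split the data
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = t, score
    return best_threshold, best_score

# Hypothetical entropy values and LNM labels for six patients.
entropy = [8.4, 8.6, 8.9, 9.5, 10.1, 10.5]
lnm = [0, 0, 0, 1, 1, 1]
threshold, weighted_gini = best_split(entropy, lnm)
```

In a random forest, this search is repeated on each resampled training set and over a random subset of features at every node, and the resulting trees vote on the final class.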
Figure 3 Visualization of prediction models based on machine learning algorithms.
A: Random forest classifier model; B: Decision tree model. Candidate factors associated with lymph node metastasis are ranked through the random forest classifier algorithm, and prediction nodes and weights are assigned by the decision tree algorithm.
Figure 4 Visualization of prediction models based on artificial neural network algorithm.
A: Artificial neural network model; B: Importance of variables using connection weights. Candidate factors associated with lymph node metastasis are ordered via artificial neural network (ANN) algorithm and prediction nodes, and weights are assigned via an ANN algorithm. IV_0: Inertia value 0°; IV_45: Inertia value 45°; IG_0: Inverse gap 0°; IG_45: Inverse gap 45°; IG_all: Inverse gap full angle; Haralick_30: Haralick 30°; Haralick_all: Haralick full angle.
Figure 5 Predictive performance of candidate models based on machine learning algorithms.
A: Decision curve analysis (DCA) for five ML-based models in training sets; B: DCA for five ML-based models in test sets. RFC: Random forest classifier; SVM: Support vector machine; DT: Decision tree; ANN: Artificial neural network; XGboost: Extreme gradient boosting.
Table 2 Receiver operating characteristic curve analysis of lymph node metastasis in each ML-based model.
RFC: Random forest classifier; SVM: Support vector machine; DT: Decision tree; ANN: Artificial neural network; XGboost: Extreme gradient boosting; GLM: Generalized linear model; AUC: Area under the receiver operating characteristic curve; 95%CI: 95% confidence interval; IV_0: Inertia value 0°; IV_45: Inertia value 45°; IV_90: Inertia value 90°; IG_0: Inverse gap 0°; IG_45: Inverse gap 45°; IG_90: Inverse gap 90°; IG_all: Inverse gap full angle; Haralick_30: Haralick 30°.
Internal validation of the optimal RFC predictive model
The prediction efficiency of the RFC model was the best in the precise stratification of LNM patients. To further evaluate the “stratification effect” of the RFC, we performed CIC analysis, which indicated that high-risk LNM was accurately distinguished using the RFC model and that “cross-linking” did not occur in the stratification process. The results of this model for the validation and training sets were consistent (Supplementary Table 2), implying that the robustness and LNM discrimination of the RFC model were satisfactory.
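The quantities plotted in a decision curve and a clinical impact curve can be sketched as follows. The net benefit formula is the standard one for decision curve analysis (true positives per patient minus false positives per patient weighted by the odds of the threshold probability); the function names, the threshold, and the example predictions are assumptions and do not come from this study.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of treating every patient whose predicted risk is >= threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

def clinical_impact(probs, labels, threshold, per=1000):
    """Per `per` patients: number flagged as high risk, and number flagged who have the event."""
    n = len(labels)
    high = sum(1 for p in probs if p >= threshold)
    high_events = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    return per * high / n, per * high_events / n

# Hypothetical predicted LNM risks and observed LNM status for ten patients.
probs = [0.9, 0.8, 0.6, 0.3, 0.2, 0.1, 0.05, 0.05, 0.02, 0.01]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
nb = net_benefit(probs, labels, threshold=0.5)
flagged, flagged_with_lnm = clinical_impact(probs, labels, threshold=0.5)
```

Evaluating both functions over a grid of thresholds yields the two curves; the closer the two clinical impact counts are to each other, the fewer patients are flagged unnecessarily.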
The standard treatment for early GC is surgery. However, ER has recently become the standard local treatment for some patients with early GC without LNM. For a long time, it has been used to treat differentiated-type early GC limited to the mucosa, with a diameter of < 2 cm[22,23]. Recent studies have expanded the ER indications to include UEGC of ≤ 2 cm in diameter without ulcer or lymphatic vessel invasion. However, whether UEGC can accept the standard treatment of ER remains a subject of debate. That is, additional surgery should be performed if curability is considered questionable. Given this situation, the risk factors of LNM or distant metastasis and mortality after non-curative ER of UEGC should be investigated. Previous studies have also shown that patients with two or more risk factors (e.g., ulcer, submucosal invasion, and positive vertical margin) benefit greatly from surgical resection after non-curative ER of UEGC[14,25]. However, because of the heterogeneity of clinical characteristics, risk stratification based on these factors provides only a simple prediction, which is challenging to apply in clinical practice.
The potential application of the GLCM in the prediction of LNM of UEGC has not been systematically explored thus far. In this study, GLCM-based features were extracted from underlying grayscale images collected through MRI. We developed an LNM risk prediction model for patients with UEGC using an ML-based algorithm. The following are the two important findings of our study. First, the accurate risk stratification of UEGC patients who should undergo additional surgery depends on the added value of the GLCM. Second, a new ML-based prediction model was used to identify patients and whether they have LNM. According to previous studies, texture analysis can quantify the spatial differences of pixels and the subtle differences reflected in gray values, which is consistent with the conclusion of this study. To some extent, we used GLCM features to gather spatial information and reduced the overfitting effect by replacing the softmax layer with the ML-based algorithm.
In this study, we created five types of ML-based models (i.e., RFC, ANN, DT, XGBoost, and SVM), which used GLCM features to predict LNM. Interestingly, the prediction efficiency differed among the ML-based models built with different algorithms. For example, the RFC model had the highest predictive accuracy, which was achieved by incorporating Entropy, Haralick_all, Haralick_30, IG_all, IG_45, IG_0, and IV_45. Meanwhile, the ANN, DT, XGBoost, and SVM exhibited an inferior performance compared with the RFC. This suggests that the accuracy of the RFC in predicting LNM is superior to that of the other ML models. A previous study indicated that a random forest algorithm is more efficient in processing classification problems, which is consistent with the results of this study. Meanwhile, DT is not as good as the RFC in terms of fitting, and the lower prediction ability of the ANN model indicates that an “overfitting” phenomenon may occur. In general, different ML models show consistent accuracy, indicating that the prediction performance of ML can be improved through data processing.
Our results confirm a GLCM-based LNM classification, which has an ideal predictive effect on the diagnosis and treatment of patients with UEGC. However, the following problems were inevitably encountered in this study. First, because this study involved a retrospective analysis, the case inclusion criteria may have a certain bias on the results, which remains to be confirmed by a large sample of prospective studies in the future. Second, there were relatively few selected cases in this study, and only some parameters of the GLCM were extracted. Thus, the results of its prediction model should be verified by external data. Third, when data from multi-center and large sample studies are available in the future, it is crucial to predict the presence or absence of LNM. Additionally, the GLCM is an important imaging sequence of UEGC, and hence we will further perform other image texture analyses in subsequent research.
GLCM-based feature extraction could, in general, serve as a robust and promising tool to improve predictive efficiency for LNM in individual UEGC patients. ML adopts the algorithm of “classification and pruning” and clearer feature extraction, leading to better data fitting than the conventional prediction model. The model constructed using the RFC had the highest predictive accuracy, with the following being the most important predictors: Entropy, Haralick_all, Haralick_30, IG_all, IG_45, IG_0, and IV_45. In the future, we are still required to validate and optimize these prediction models using datasets of various scenarios to better apply them to clinical practice.
Gray-level co-occurrence matrix (GLCM)-based feature extraction could serve as a robust and promising tool to improve the predictive efficiency for lymph node metastasis (LNM) in individual undifferentiated early gastric cancer (UEGC) patients. Additionally, machine learning (ML) adopts more optimized algorithms and clearer feature extraction. Models built using the random forest classifier (RFC) had the highest predictive accuracy when incorporating Entropy, Haralick full angle (Haralick_all), Haralick 30° (Haralick_30), Inverse gap full angle (IG_all), Inverse gap 45° (IG_45), Inverse gap 0° (IG_0), and Inertia value 45° (IV_45). Further research is needed to develop these models for clinical practice.
The evaluation results indicate that selecting radiological and textural features is effective for discriminating LNM in UEGC patients. In addition, an ML-based prediction model developed using the RFC can be used to identify LNM and guide treatment options, which may improve clinical outcomes.
GLCM based feature extraction significantly correlated with LNM. The top 7 GLCM based factors included Inertia value 0°, IV_45, IG_0, IG_45, IG_all, Haralick_30, Haralick_all, and Entropy. The areas under the receiver operating characteristic (ROC) curve (AUCs) of the RFC model, support vector machine (SVM), eXtreme gradient boosting (XGBoost), artificial neural network (ANN), and decision tree (DT) ranged from 0.805 [95% confidence interval (CI): 0.258-1.352] to 0.925 (95%CI: 0.378-1.472) in the training set and from 0.794 (95%CI: 0.237-1.351) to 0.912 (95%CI: 0.355-1.469) in the testing set, respectively. The RFC (training set: AUC: 0.925, 95%CI: 0.378-1.472; testing set: AUC: 0.912, 95%CI: 0.355-1.469) model incorporating Entropy, Haralick_all, Haralick_30, IG_all, IG_45, IG_0, and IV_45 had the highest predictive accuracy.
We retrospectively selected 526 cases of UEGC confirmed by pathological examination after radical gastrectomy without endoscopic treatment in four tertiary hospitals between January 2015 and December 2021. GLCM-based features were extracted from grayscale images, and ML was applied to the classification of candidate predictive variables. The robustness and clinical utility of each model were evaluated using the ROC curve, decision curve analysis, and clinical impact curve.
Identifying a potential biomarker that predicts LNM is proven to be very useful in determining treatment.
To develop an ML-based integral procedure to construct the LNM gray-level co-occurrence matrix (GLCM) prediction model.
The risk of LNM is the most important consideration in determining treatment strategies for UEGC. Therefore, identifying a potential biomarker that predicts LNM is proven to be very useful in determining treatment.
The authors thank all study participants for consenting to the use of their medical records.
Provenance and peer review: Unsolicited article; Externally peer reviewed.
Kato M, Nishida T, Yamamoto K, Hayashi S, Kitamura S, Yabuta T, Yoshio T, Nakamura T, Komori M, Kawai N, Nishihara A, Nakanishi F, Nakahara M, Ogiyama H, Kinoshita K, Yamada T, Iijima H, Tsujii M, Takehara T. Scheduled endoscopic surveillance controls secondary cancer after curative endoscopic resection for early gastric cancer: a multicentre retrospective cohort study by Osaka University ESD study group. Gut. 2013;62:1425-1432.