Copyright
©The Author(s) 2025.
World J Gastroenterol. Jun 28, 2025; 31(24): 108021
Published online Jun 28, 2025. doi: 10.3748/wjg.v31.i24.108021
AI algorithm | Parameters employed/study design | Sample size/control group/validation | Outcomes | Performance | Ref. |
SVM | Multi-center data + TCGA validation | Total n = 255 (training 212 + internal validation 43); external: 4 centers + TCGA | OS/DFS risk stratification (low/moderate/high); high-risk stage II/III chemotherapy benefit | AUC training 0.773 (OS)/0.751 (DFS); validation 0.852 (OS)/0.837 (DFS) | Li et al[21] |
SVM + APINet/TransFG | Tongue features (color/morphology/coating) + microbiome (16S rDNA); multicenter prospective study | Cohort 1: GC = 328 vs NGC = 304; cohort 2: GC = 937 vs NGC = 1911 (10 centers); external: GC = 294 vs NGC = 521 (7 centers) | Distinguish GC/early GC/precancerous lesions (e.g., AG); superior to 8 blood biomarkers | Tongue model AUC: 0.89 (initial); 0.88-0.92 (internal); 0.83-0.88 (external); microbiome AUC: 0.94 (genus)/0.95 (species) | Yuan et al[22] |
SVM/LR/kNN + feature selection | Liver/PBMC RNA-seq data | Liver = 67; PBMC = 137; external public dataset; controls: Healthy + AH/AC/MASLD/HCV | Precise differentiation of AH/AC/MASLD/HCV; minimal gene sets (33-75 genes) | Liver accuracy: 90% (AH/AC vs healthy), 91% (4-class); External 82%; PBMC accuracy: 75% (4-class) | Listopad et al[23] |
SVM/LR/RF | Multiphase CT radiomics (n = 851) | Total n = 215 (training 150 + external 65) | Multiphase CT prediction (plain scan alternative) | Nomogram C-index 0.913 (95%CI: 0.878-0.956) | Liu et al[24] |
SVM | Radiomics features extracted from CT images; integrated rad-score + clinicopathological characteristics | 693 GC patients (2 centers); training (n = 390), internal validation (n = 151), external validation (n = 152) cohorts | Rad-scores significantly associated with diffuse-type GC and SRCC (P < 0.001) | Lauren nomogram: AUC = 0.895 (training), 0.841 (internal), 0.893 (external). SRCC nomogram: AUC = 0.905 (training), 0.845 (internal), 0.918 (external) | Chen et al[25] |
Counterfactual random forest + optimal policy trees | Imatinib duration inferred via counterfactual model; OPTs interpreted counterfactual predictions | Internal: 117 (MSKCC); external: 363 (Polish) + 239 (Spanish) | OPTs recommended no imatinib for low-risk subgroups: Gastric GIST < 15.9 cm + mitotic count < 11.5/5 mm². Any site GIST < 5.4 cm + mitotic count < 11.5/5 mm² | Sensitivity: 92.7% (internal), 95.4% (Spanish), 92.4% (Polish). Specificity: 33.9% (internal) | Bertsimas et al[26] |
Markov decision tree model | Input variables from systematic review/meta-analysis of RCTs comparing DS, EUS-GE, and GJ; prospective cohort study for EUS-GE | 15 studies in Markov model | 1-month survival: DS (81.2%), EUS-GE (80.4%) > GJ (75.5%). 6-month survival: GJ (25.2%), EUS-GE (23.8%) > DS (21.3%) | EUS-GE and GJ outperformed DS for long-term palliation (6 months) | Chue et al[27] |
Decision trees, LASSO, kNN, random forests | Pathomics features extracted from HE-stained WSIs; multicenter retrospective study | 584 gastric cancer patients (training: 325, internal validation: 113, external validation 1: 73, external validation 2: 73) | Pathomics signature independently predicted progression-free survival (P < 0.001, HR = 0.34) | Training: AUC = 0.985; internal validation: AUC = 0.921 | Han et al[28] |
Optimal classification trees | Input variables: Tumour size, mitotic count, tumour site | Internal: 395 patients (MSKCC + Spanish consortium); external: 556 patients (Polish registry) | OCT significantly improved calibration compared to MSK nomogram | Higher C-index for OCT (0.805 vs 0.788); slope = 1.041 (OCT) vs 0.681 (MSK); no significant calibration error for OCT | Bertsimas et al[29] |
Gradient-boosting decision tree | Baseline characteristics, endoscopic atrophy | Total: 1099 chronic gastritis patients, training: 879, test: 220 | Key predictors: Age, OLGIM/OLGA stage, endoscopic atrophy, history of other malignancies | Harrell’s c-index: 0.84 (test set). Stratified risk into 3 categories | Arai et al[30] |
GBM | Prospective cohort study (15-year follow-up); 70% training and 30% validation split | FINRISK 2002 cohort: 7115 individuals (103 incident liver disease, 41 alcoholic liver disease) | Gut microbiome and conventional factors showed comparable predictive power | Liver disease: AUROC = 0.834 (microbiome + conventional) vs 0.768 (conventional); alcoholic liver disease: AUROC = 0.956 (microbiome + conventional) vs 0.875 (conventional) | Liu et al[31] |
XGBoost (pre/delta-radiomics) + SMOTE | Pre/post-treatment MRI radiomics (n = 105); multisequence MRI integration | LARC patients n = 84; validation: 5/10-fold CV + independent; no control | Delta-radiomics > pre-radiomics | Pre-model: AUC 0.93 ± 0.06 (train)/0.79 (test); delta-model: AUC 0.96 ± 0.03 (train)/0.83 (test) | Wang et al[32] |
sPLS-DA | Multi-site microbiome (saliva/esophagus/stomach); 16S rRNA analysis | EoE: Saliva = 29, biopsy = 25; controls: Non-EoE = 20 (saliva)/5 (biopsy) | Saliva model distinguishes EoE/non-EoE; esophageal microbiota detects disease activity | Saliva: CE 24%, validation Acc 78.6% (sensitivity 80%/specificity 75%); esophagus: CE 8% (activity detection) | Facchin et al[33] |
sPLS-DA + LR | Genome-wide 5hmC features (n = 64); protein biomarkers | Healthy = 165; LC = 62; HCC = 135; longitudinal cohort | HCC diagnosis/recurrence prediction; tumor burden monitoring | Wild-score AUC = 93.24% (HCC vs non-HCC); HCC score AUC = 92.75% (HCC vs LC) | Cai et al[34] |
LR + mixed-effects model | Multicohort clinical/serologic/genetic data; JAK-STAT/IL6 pathway | IBD patients = 12083 (4 cohorts); within-case design | Female/CD colonic location/surgery linked to EIMs; MHC/CPEB4 associations; therapeutic targets (TNF/JAK-STAT) | MHC OR = 2.5 (P = 1.4 × 10⁻¹⁵); CPEB4 OR = 1.5 (P = 2.7 × 10⁻⁸); serologic panel OR = 1.7 (P = 3.6 × 10⁻¹⁹) | Khrom et al[35] |
LR + RF + kNN + SVM + NN | Recursive feature elimination; single-center retrospective | Total n = 864 (IIIa + n = 457 vs low-risk n = 407); 3-fold imputation/CV | NN outperforms others (Acc 68.8%); best in medical complications (AUC = 0.695) | NN: Overall Acc 0.688/AUC = 0.672; medical AUC = 0.695; surgical AUC = 0.653; Cologne score Acc 0.510 | Jung et al[36] |
RF vs cv-Enet/glmboost/ensemble | Multicenter preoperative features; elastic-net regularization | Development = 3182 (39 centers); validation = 260; no control | RF optimal prediction; surgical decision support | RF AUC = 0.844 (0.841-0.848) (development); similar in validation | Pera et al[37] |
LR + Cox regression models | Endoscopic features (whitish/irregular) + Histology (marked IM); retrospective multicenter | Total n = 182 (malignant = 48); progression cohort = 98; ROC/KM validation | Misdiagnosis predictors (single/large/IM); progression predictors (whitish/margin/multi-diagnosis) | AUC 0.871 (sensitivity 68.7%/specificity 92.5%) | Zou et al[38] |
RF + Swin transformer tongue model | Questionnaire features (n = 10) + tongue images; multicenter | Total n = 2229 (9 centers); validation AUC > 0.8 | Key factors: Age/TCM constitution/tongue features/diet/anxiety; dynamic nomogram | RF Acc 85.65%; tongue model Acc 73.33% (validation) | Yu et al[39] |
LR | Tumor location/ulceration/biopsy features; H-L test/DCA validation | Training = 516; validation = 220 (7:3 split); no control | 4 fibrosis predictors; severe fibrosis prediction model | Training AUC = 0.819; validation AUC = 0.812; DCA clinical benefit | Zeng et al[40] |
Stepwise logistic regression | Demographics/history/lab markers (AFP/AST/albumin); prospective multicenter | Total n = 1723; HCC events = 109; median follow-up 2.2 years; no control | Key factors: Male/cirrhosis duration/family history/age/obesity/AFP/AST | Incidence 24/100 person-years; multivariate OR 1.08-2.73 (P < 0.05) | Reddy et al[41] |
Multivariate logistic regression | Radiomic features (peritumoral enhancement/necrosis); transcriptomic sequencing | Development = 470; validation: Control = 145 + HAIC = 143; multicenter | Imaging subtypes guide HAIC benefit; immune pathway correlation | Training AUC = 0.83; control AUC = 0.84; HAIC AUC = 0.73 | Ma et al[42] |
LR | Multiphase CT radiomics (peritumoral); RNA sequencing | Total n = 773 (training 334 + internal 142 + external 141 + survival 121 + RNA35); 4 centers | MVI prediction + survival stratification (early recurrence/OS); glucose metabolism genes | Hybrid model AUC: 0.86 (internal)/0.84 (external); survival P < 0.01 | Xia et al[43] |
Multivariate logistic regression | LI-RADS visualization score (A/B/C); obesity class II-III | Total n = 2053 (A = 1685, B = 262, C = 106); longitudinal = 1546; multicenter | Alcohol/MASLD cirrhosis + obesity linked to limited visualization; 19.6% worsened/53.1% improved | Baseline limited rate 18%; obesity OR = 2.1 (P < 0.001) | Schoenberger et al[44] |
Regularized LR + GBM | RCT secondary analysis; mailed outreach; prior screening behavior | Total n = 1200 (training 960 + test 240); 3 screening rounds; no control | Surveillance adherence stratification; key variables: Prior screening/primary care contact | AUROC 0.66-0.77 (increasing); 41%-47% completion rate | Singal et al[45] |
LASSO logistic regression | Pre/intraoperative variables; multicenter international | Total n = 2192 (train 70% + valid 30%); 12 centers | Dual prediction (PHLF/CCI > 40); online risk calculators | PHLF AUC = 0.80 (calib. slope = 0.95); CCI AUC = 0.76 | Wang et al[46] |
LDpred2 PRS + QCancer-10 integration | Genetic/non-genetic factors; Cox proportional hazards | United Kingdom Biobank n = 434587; case-control/survival validation | C-index improvement (M + 7.3%/F + 6.5%); high-risk group 3.47 × (M)/2.77 × (F) | Integrated C-index: 0.730 (M)/0.687 (F); sensitivity/specificity: 47.8%/80.3% (M), 42.7%/80.1% (F) | Briggs et al[47] |
Multivariable logistic + Cox regression | Multicenter FS screening; long-term follow-up (median 17 years) | Intervention = 40085 (13 centers) | High-ADR group: Distal CRC HR = 0.34 (incidence)/0.22 (mortality); all-site CRC HR = 0.58/0.52 | High vs low-ADR: Distal CRC HR 0.34 vs 0.55 (incidence), 0.22 vs 0.54 (mortality) | Cross et al[48] |
RRR + elastic net models | Inflammatory markers (CRP/IL6/GDF15) + metabolic markers (BMI/waist/C-peptide); case-control | Total n = 1368 (cases 684 + controls 684); NHS = 818F + HPFS = 550M | Sex-specific: Median OR = 1.34 (inflammation)/1.25 (metabolic); NS in F; 11 key metabolites | Variance explained: 24% (inflammation)/27% (metabolic) | Bever et al[49] |
RSF/GBM/DeepHit | Multivariable analysis + clinical feature selection; time-dependent C-index | CRC patients = 2157; stratified 5-fold CV (5 repeats) | DeepHit best discrimination; RSF best calibration; SHAP key factors (R0 resection/TNM) | DeepHit C-index 0.789 (0.779-0.799); RSF Brier score 0.096 (0.094-0.099) | Yang et al[50] |
Multivariable logistic regression | CellSearch CTC detection; prospective CTCs + retrospective HGP; excluded neoadjuvant/extrahepatic | Total n = 177 (dHGP = 34, 19%); multivariable validation; no external cohort | CTC-negativity predicts dHGP (OR = 2.7); dHGP better survival | OR = 2.7 (1.1-6.8), P = 0.028 | Meyer et al[51] |
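
Several rows above report a concordance index (C-index) for survival models. As a reading aid, the following is a minimal Python sketch of Harrell's C-index for right-censored data. The function name `harrell_c_index` and the toy follow-up values are hypothetical illustrations and are not taken from any cited study; pairs with tied follow-up times are simply skipped, which is a simplifying assumption.

```python
from itertools import combinations


def harrell_c_index(times, events, risks):
    """Fraction of comparable patient pairs ranked correctly by predicted risk.

    times  -- follow-up time per patient
    events -- 1 if the event was observed, 0 if censored
    risks  -- predicted risk score (higher = expected earlier event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let a be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time ended in an event.
        # Tied times are skipped here (simplifying assumption).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher risk failed first: concordant
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.3, 0.5, 0.1]
c_index = harrell_c_index(toy_times, toy_events, toy_risks)
print(round(c_index, 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the C-index entries in the Performance column are read.
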
AI algorithm | Parameters employed/study design | Sample size, control group, validation | Outcomes | Performance | Ref. |
CNN | 14 EUS anatomical sites; multicenter validation | Training: 1812 patients/6230 images; internal: 47 patients/1569 images; external: 131 patients/85322 images | Outperformed novices in 11 sites; high expert agreement (kappa 0.84-0.98) | Internal Acc 92.1%-100%; external sensitivity 89.45%-99.92%/specificity 93.35%-99.79% | Tian et al[52] |
NNLS deconvolution + GCNN | Methylation atlas (TSMA) + genome-wide density; multi-modal strategy | 5 tumor types + WBC training; validation = 239 low-depth cfDNA | Multi-modal improves TOO in low-depth cfDNA | Validation Acc 69% | Nguyen et al[53] |
CNN + survival MLP | CT + clinical multimodal data; 5-fold CV | GC patients = 1061; vs 3 SOTA methods; no control | Multimodal > single-modality; optimal OS/PFS prediction | OS C-index 0.849; PFS 0.783 (surpass SOTA) | Hao et al[54] |
CNN | HE features for HER2 status; trastuzumab response | Surgical = 300; biopsy = 101; treated = 41; no control | HER2 amplification prediction; treatment response (CR + PR vs SD + PD) | Surgical AUC 0.847 (amplification)/0.903 (2 +); biopsy 0.723; treatment 0.833 | Wu et al[55] |
DCNN | HE whole-slide imaging; fibrosis stage comparison | Non-HCC = 639; HCC = 46; paired training/unpaired validation | Detect HCC risk in mild fibrosis; saliency maps reveal nuclear atypia/immune infiltration | Training Acc 81.0% (AUC = 0.80); validation 82.3% (AUC = 0.84) | Nakatsuka et al[56] |
Faster R-CNN model | Preoperative CT/MRI analysis; multicenter retrospective cohort (2012-2020) | Total n = 1141 (PCCCL = 62, CHCC = 1079); 4:1 split (train-val vs test); CHCC cases (n = 1079) as negative control | Differential diagnosis of rare PCCCL | Accuracy: 0.962 (95%CI: 0.931-0.992); AP: PCCCL 0.908, CHCC 0.907; Recall: 0.95 | Liu et al[57] |
Transformer | End-to-end biomarker prediction; multicenter validation | Total n > 13k (16 CRC cohorts); resection training/biopsy validation | Solved biopsy MSI diagnosis; improved interpretability | MSI detection: Sensitivity 0.99/NPV > 0.99 | Wagner et al[58] |
CNN + SMOTE/SVM | Pathomics/radiomics/immune score (CD3 +/CD8 +)/clinical; digital pathology | Lung metastasis = 103; internal validation | Path/radio features vs immunoscore (neg); triple independent prognosis | Integrated model: OS = 0.860/DFS = 0.875; Calib/DCA validated | Wang et al[59] |
INSIGHT (CNN) + wise MSI (self-attention) two-stage | Tumor tile classification + ResNet pre-trained + attention pooling; multicenter | Chinese multicenter cohort; vs 5 DL methods | Outperforms SOTA in MSI prediction; high pathologist consistency | Wise MSI AUC 0.954 (0.948-0.960) | Chang et al[60] |
CNN + RNN | Multicenter blinded trial; real-time monitoring + second observer | Total n = 946 (adenomas = 989); multicenter | CADe > human in adenoma detection (sensitivity 94.6% vs 96.0%); changed 2.3% follow-up | ADR + 1.1%/case; Non-neoplastic + 4.9%; time + 42.6% (6.6 minutes) | Sinonquel et al[61] |
ANN | Pathological image analysis; retrospective multicenter | Training = 496 (GDPH); external validation = 150 (SYSMH) | Avoided 34.9% unnecessary surgeries; outperformed United States guidelines | Training AUC = 0.979; validation AUC = 0.978 | Su et al[62] |
Multitask transformer | Preop MRI multiparametric features; 7-center retrospective | Total n = 725 (train 234 + internal 58); external = 212/111/110 | PA-TACE benefit in high-MVI/low-survival group (P < 0.001) | RFS C-index: Training 0.763/validation 0.628-0.728 | Wang et al[63] |
Multistage DL models | Longitudinal MRI (pre/post-TA) + clinical variables; multicenter retrospective | Total n = 289 (train 254 + external 35); 3 hospitals | DL clinical improved ER prediction (AUC = 0.740); High/low-risk RFS P = 0.04 | DL clinical AUC: 0.740 vs 0.571/0.648/0.689 | Kong and Li[64] |
CNN | Clinical data + MRI radiomics; 6 time-frame prediction | Early HCC = 120 (recurrence = 44); retrospective (2005-2018) | Imaging model > clinical (AUC 0.76 vs 0.68, P = 0.03) | Imaging model AUC 0.71-0.85; KM P < 0.05 (2-6 years) | Iseke et al[65] |
RSF/ANN/decision tree | Inflammatory markers + ALBI + AFP + tumor size + INR; single-center retrospective | Total n = 808 (train 2:1 split) | ANN optimal (5 years AUC = 0.85); High-risk OS HR = 7.98 (5.85-10.93) | Training AUC 0.85 (0.82-0.88); validation 0.82 (0.74-0.85); P < 0.0001 | Zhang et al[66] |
DL | DCE-MRI + clinical/radiologic features; retrospective multicenter | Total n = 355 (train 251 + internal 62 + external 42); 2 centers | Proliferative HCC prediction; fusion model improves recurrence stratification | DL + clinical + radiologic model AUC: Training 0.99/internal 0.87/external 0.80 | Qu et al[67] |
DenseNet169 + MLP | Multiphase 2.5D CT + clinical features + RNA-seq; multicenter retrospective | Total n = 620 (TCIA + 3 centers); internal + 2 external test sets | Stratified RFS/OS (P < 0.001); high score links WNT/MYC/KRAS activation | DLER MLP 0.891 vs DLER 0.797 vs clinical model 0.752 | Guo et al[68] |
scSE-CatBoost | Multi-site endoscopic images; CNN + scSE feature extraction | Total n = 302 (An Nan Hospital); RUT validation | Real-time Helicobacter pylori detection; NPV 100% | Acc 0.90; sensitivity 1.00/specificity 0.81; AUC = 0.88 | Lin et al[69] |
Transformer + MIL | HE WSIs; dual-task (subtype + TMB prediction) | EC = 529/918; CRC = 594/1495; vs 7 SOTA methods | Strong subtype-TMB association (Fisher P < 0.001); guides immunotherapy | Outperformed SOTA in both tasks | Wang et al[70] |
GAN + ViT distillation | HE/HPS staining; multi-task prognosis (OS/TTR/TRG) | Internal = 258 CLM; two public datasets | TRG dichotomization Acc 86.9%-90.3%; 3-class Acc 78.5%-82.1% | OS C-index 0.804 (± 0.014); TTR C-index 0.735 (± 0.016) | Elforaici et al[71] |
Transfer learning | HE WSIs analysis | Segmentation = 100 WSI; validation: 4 cohorts (3 internal +1 external) + 6-month series = 217 | Fine-tuning improved F1 0.797-0.949 (P < 0.00001); 100% visual overlay accuracy | Detection model AUC 0.959-0.978 (P < 0.00001) | Khan et al[72] |
DBMIA-Net | GIA + EIA modules; adaptive channel graph convolution | 5 public datasets (CVC-ClinicDB); vs SOTA methods | Enhanced generalization | 94.12% dice (vs PraNet + 4.22%); leading in 6 metrics | Zhang et al[73] |
UC-former vision transformer | Multicenter retrospective study; Mayo endoscopic score prediction | Total n = 768 UC patients/15120 images; internal + 3 external validations | Surpassed senior endoscopists; strong multicenter stability | Internal Acc 90.8%; external Acc 82.4%-85.0% | Qi et al[74] |
MIST | Self-supervised contrastive learning + dual-stream MIL | Total n = 480/666 WSI (Drum Tower); external = 273 WSI (Nanjing First) | Acc comparable to pathologists (0.784 vs 0.806) | External Acc 0.784 | Cai et al[75] |
ResTransUNet | Global context (transformer) + local features (CNN); LiTS2017/3Dircadb/Chaos/Sliver07 | LiTS2017/3Dircadb/Chaos/Sliver07 | Solved small/discontinuous region segmentation; outperformed SOTA | LiTS2017 dice 0.9535/VOE 0.0804/RVD -0.0007 | Ou et al[76] |
GCN | Pathological micronecrosis analysis + multicenter datasets; GCN feature fusion | Total n = 752/3622 slides; internal (FAH-ZJUMS) + external (TCGA-LIHC) | Improved prognostic stratification; precise necrosis localization | Internal + 8.18%; External + 9.02%; superior C-index vs baseline | Deng et al[77] |
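
The Performance column reports accuracy, sensitivity, specificity, and NPV side by side. The sketch below shows how these quantities relate through a confusion matrix. The function name `diagnostic_metrics` and the counts are hypothetical and are not data from any cited study; they are chosen only to show that a sensitivity of 1.00 (no false negatives) implies an NPV of 100%, even when specificity is lower.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),             # true-positive rate
        "specificity": tn / (tn + fp),             # true-negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "npv": tn / (tn + fn),                     # negative predictive value
    }


# Hypothetical counts: 50 diseased (all detected), 50 healthy (10 false alarms).
metrics = diagnostic_metrics(tp=50, fp=10, tn=40, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Unlike sensitivity and specificity, NPV depends on disease prevalence in the tested cohort, so it does not transfer directly between study populations.
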
- Citation: Chen ZL, Wang C, Wang F. Revolutionizing gastroenterology and hepatology with artificial intelligence: From precision diagnosis to equitable healthcare through interdisciplinary practice. World J Gastroenterol 2025; 31(24): 108021
- URL: https://www.wjgnet.com/1007-9327/full/v31/i24/108021.htm
- DOI: https://dx.doi.org/10.3748/wjg.v31.i24.108021