Copyright © The Author(s) 2019.
World J Gastroenterol. Apr 14, 2019; 25(14): 1666-1683
Published online Apr 14, 2019. doi: 10.3748/wjg.v25.i14.1666
Table 1 Artificial intelligence terminology
Artificial intelligence | Machine intelligence that has cognitive functions similar to those of humans, such as “learning” and “problem solving”
Machine learning | Mathematical algorithms that are built automatically from given data (known as input training data) and that predict or make decisions under uncertain conditions without being explicitly programmed
Support vector machines | Discriminative classifier formally defined by an optimal separating hyperplane with the largest functional margin
Artificial neural networks | Multilayered interconnected network consisting of an input layer, hidden layers (between the input and output layers), and an output layer
Deep learning | Subset of machine learning techniques composed of multilayered neural network algorithms
Convolutional neural networks | Specific class of artificial neural networks that consists of (1) convolutional and pooling layers, which are the two main components for extracting distinct features; and (2) fully connected layers that make the overall classification
Overfitting | Modelling error that occurs when a learning model tailors itself too closely to the training dataset, so that its predictions do not generalize well to new datasets
Spectrum bias | Systematic error that occurs when the dataset used for model development does not adequately represent the range of patients to whom the model will be applied in clinical practice (target population)
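The overfitting entry above can be illustrated with a minimal Python sketch. It contrasts a 1-nearest-neighbour classifier, which memorizes every training sample, with a single-threshold rule. The data points, the two deliberately mislabeled training samples, and all function names are hypothetical and chosen only for illustration; none of them come from the studies summarized in this review.

```python
# Hypothetical 1-D data: the true rule is "label 1 when x >= 5".
# Two training labels (x=3 and x=8) are deliberately wrong to mimic noise.
TRAIN = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 1), (8, 0), (9, 1)]
TEST = [(1.2, 0), (2.9, 0), (6.8, 1), (8.1, 1)]  # labels follow the true rule


def predict_1nn(x, train=TRAIN):
    """Return the label of the closest training point (memorizes noise)."""
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]


def predict_threshold(x, threshold=5):
    """A much simpler model: a single cut-off on x."""
    return 1 if x >= threshold else 0


def accuracy(predict, data):
    """Fraction of samples whose predicted label equals the recorded label."""
    correct = sum(1 for x, label in data if predict(x) == label)
    return correct / len(data)


if __name__ == "__main__":
    for name, model in [("1-NN", predict_1nn), ("threshold", predict_threshold)]:
        print(name, "train:", accuracy(model, TRAIN), "test:", accuracy(model, TEST))
```

On this toy data the memorizing model scores perfectly on the training set but worse on the held-out points, while the simpler rule does the opposite, which is the pattern the definition describes.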
Table 2 Summary of clinical studies using artificial intelligence for recognition of diagnosis and prediction of prognosis
Ref. | Published year | Aim of study | Design of study | Number of subjects | Type of AI | Input variables (number/type) | Outcomes
Pace et al[11] | 2005 | Diagnosis of gastroesophageal reflux disease | Retrospective | 159 patients (10 times cross validation) | “backpropagation” ANN | 101/clinical variables | Accuracy: 100%
Lahner et al[12] | 2005 | Recognition of atrophic corpus gastritis | Retrospective | 350 patients (subdivided several times into training and test set equally) | ANN | 37 to 3/clinical and biochemical variables (experiment 1 to 5) | Accuracy: 96.6%, 98.8%, 98.4%, 91.3% and 97.7% (experiment 1-5, respectively)
Pofahl et al[13] | 1998 | Prediction of length of stay for patients with acute pancreatitis | Retrospective | 195 patients (training set: 156, test set: 39) | “backpropagation” ANN | 71/clinical variables | Sensitivity: 75% (for prediction of a length of stay more than 7 d)
Das et al[14] | 2003 | Prediction of outcomes in acute lower gastrointestinal bleeding | Prospective | 190 patients (training set: 120, internal validation set: 70, external validation set: 142) | ANN | 26/clinical variables | Accuracy (external validation set): 97% for death, 93% for recurrent bleeding, 94% for need for intervention
Sato et al[15] | 2005 | Prediction of 1-year and 5-year survival of esophageal cancer | Retrospective | 418 patients (training : validation : test set = 53% : 27% : 20%) | ANN | 199/clinicopathologic, biologic, and genetic variables | AUROC for 1-year and 5-year survival prediction: 0.883 and 0.884, respectively
Rotondano et al[16] | 2011 | Prediction of mortality in nonvariceal upper gastrointestinal bleeding | Prospective, multicenter | 2380 patients (5 × 2 cross-validation) | ANN | 68/clinical variables | Accuracy: 96.8%, AUROC: 0.95, sensitivity: 83.8%, specificity: 97.5%
Takayama et al[17] | 2015 | Prediction of prognosis in ulcerative colitis after cytoapheresis therapy | Retrospective | 90 patients (training set: 54, test set: 36) | ANN | 13/clinical variables | Sensitivity: 96.0%, specificity: 97.0%
Hardalaç et al[18] | 2015 | Prediction of mucosal healing by azathioprine therapy in IBD | Retrospective | 129 patients (training set: 103, validation set: 13, test set: 13) | “feed-forward back-propagation” and “cascade-forward” ANN | 6/clinical variables | Total correct classification rate: 79.1%
Peng et al[19] | 2015 | Prediction of frequency of onset, relapse, and severity of IBD | Retrospective | 569 UC and 332 CD patients (training set: data from 2003-2010, validation set: data in 2011) | ANN | 5/meteorological data | Accuracy in predicting the frequency of relapse of IBD (mean square error = 0.009, mean absolute percentage error = 17.1%)
Ichimasa et al[20] | 2018 | Prediction of lymph node metastasis, thus minimizing the need for additional surgery in T1 colorectal cancer | Retrospective | 690 patients (training set: 590, validation set: 100) | SVM | 45/clinicopathological variables | Accuracy: 69%, sensitivity: 100%, specificity: 66%
Yang et al[21] | 2013 | Prediction of postoperative distant metastasis in esophageal squamous cell carcinoma | Retrospective | 483 patients (training set: 319, validation set: 164) | SVM | 30/7 clinicopathological variables and 23 immunomarkers | Accuracy: 78.7%, sensitivity: 56.6%, specificity: 97.7%, PPV: 95.6%, NPV: 72.3%
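The outcome measures reported in Table 2 (accuracy, sensitivity, specificity, PPV, NPV) all derive from the four cells of a confusion matrix. The sketch below shows the standard formulas in Python. The function name and the counts in the usage example are hypothetical; they are not taken from any study in the table, and the function assumes that every denominator is non-zero.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute common diagnostic measures from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives and false negatives. Assumes no denominator is zero.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # share of diseased cases detected
        "specificity": tn / (tn + fp),   # share of healthy cases cleared
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    metrics = diagnostic_metrics(tp=90, fp=20, tn=80, fn=10)
    for name, value in metrics.items():
        print(f"{name}: {value:.1%}")
```

Because accuracy pools both classes, a model can combine high sensitivity with modest accuracy when specificity is low, which is one reason the table reports the measures separately.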
Table 3 Summary of clinical studies using artificial intelligence in the upper gastrointestinal field
Ref. | Published year | Aim of study | Design of study | Number of subjects | Type of AI | Endoscopic or ultrasound modality | Outcomes
Takiyama et al[22] | 2018 | Recognition of anatomical locations of EGD images | Retrospective | Training set: 27335 images from 1750 patients. Validation set: 17081 images from 435 patients | CNN | White-light endoscopy | AUROCs: 1.00 for the larynx and esophagus, and 0.99 for the stomach and duodenum recognition
van der Sommen et al[23] | 2016 | Discrimination of early neoplastic lesions in Barrett’s esophagus | Retrospective | 100 endoscopic images from 44 patients (leave-one-out cross-validation on a per-patient basis) | SVM | White-light endoscopy | Sensitivity: 83%, specificity: 83% (per-image analysis)
Swager et al[24] | 2017 | Identification of early Barrett’s esophagus neoplasia on ex vivo volumetric laser endomicroscopy images | Retrospective | 60 volumetric laser endomicroscopy images | Combination of several methods (SVM, discriminant analysis, AdaBoost, random forest, etc.) | Ex vivo volumetric laser endomicroscopy | Sensitivity: 90%, specificity: 93%
Kodashima et al[25] | 2007 | Discrimination between normal and malignant tissue at the cellular level in the esophagus | Prospective ex vivo pilot | 10 patients | ImageJ program | Endocytoscopy | Difference in the mean ratio of total nuclei to the entire selected field: 6.4 ± 1.9% in normal tissues and 25.3 ± 3.8% in malignant samples
Shin et al[26] | 2015 | Diagnosis of esophageal squamous dysplasia | Prospective, multicenter | 375 sites from 177 patients (training set: 104 sites, test set: 104 sites, validation set: 167 sites) | Linear discriminant analysis | HRME | Sensitivity: 87%, specificity: 97%
Quang et al[27] | 2016 | Diagnosis of esophageal squamous cell neoplasia | Retrospective, multicenter | Same data as reference number 26 | Linear discriminant analysis | Tablet-interfaced HRME | Sensitivity: 95%, specificity: 91%
Horie et al[28] | 2019 | Diagnosis of esophageal cancer | Retrospective | Training set: 8428 images from 384 patients. Test set: 1118 images from 97 patients | CNN | White-light endoscopy with NBI | Sensitivity: 98%
Huang et al[29] | 2004 | Diagnosis of H. pylori infection | Prospective | Training set: 30 patients. Test set: 74 patients | Refined feature selection with neural network | White-light endoscopy | Sensitivity: 85.4%, specificity: 90.9%
Shichijo et al[30] | 2017 | Diagnosis of H. pylori infection | Retrospective | Training set: CNN1: 32208 images; CNN2: images classified according to 8 different locations in the stomach. Test set: 11481 images from 397 patients | CNN | White-light endoscopy | Accuracy: 87.7%, sensitivity: 88.9%, specificity: 87.4%, diagnostic time: 194 s
Itoh et al[31] | 2018 | Diagnosis of H. pylori infection | Prospective | Training set: 149 images (596 images through data augmentation). Test set: 30 images | CNN | White-light endoscopy | AUROC: 0.956, sensitivity: 86.7%, specificity: 86.7%
Nakashima et al[32] | 2018 | Diagnosis of H. pylori infection | Prospective pilot | 222 patients (training set: 162, test set: 60) | CNN | White-light endoscopy and image-enhanced endoscopy, such as blue laser imaging-bright and linked color imaging | AUROC: 0.96 (blue laser imaging-bright), 0.95 (linked color imaging)
Kubota et al[33] | 2012 | Diagnosis of depth of invasion in gastric cancer | Retrospective | 902 images (10 times cross validation) | “backpropagation” ANN | White-light endoscopy | Accuracy: 77.2%, 49.1%, 51.0%, and 55.3% for T1-4 staging, respectively
Hirasawa et al[34] | 2018 | Detection of gastric cancers | Retrospective | Training set: 13584 images. Test set: 2296 images | CNN | White-light endoscopy, chromoendoscopy, NBI | Sensitivity: 92.2%, detection rate for lesions with a diameter of 6 mm or more: 98.6%
Zhu et al[35] | 2018 | Diagnosis of depth of invasion in gastric cancer (mucosa/SM1/deeper than SM1) | Retrospective | Training set: 790 images. Test set: 203 images | CNN | White-light endoscopy | Accuracy: 89.2%, AUROC: 0.94, sensitivity: 74.5%, specificity: 95.6%
Kanesaka et al[36] | 2018 | Diagnosis of early gastric cancer using magnifying NBI images | Retrospective | Training set: 126 images. Test set: 81 images | SVM | Magnifying NBI | Accuracy: 96.3%, sensitivity: 96.7%, specificity: 95%, PPV: 98.3%
Gatos et al[37] | 2017 | Diagnosis of chronic liver disease | Retrospective | 126 patients (56 healthy controls, 70 with chronic liver disease) | SVM | Ultrasound shear wave elastography imaging with a stiffness value-clustering | AUROC: 0.87, highest accuracy: 87.3%, sensitivity: 93.5%, specificity: 81.2%
Kuppili et al[38] | 2017 | Detection and characterization of fatty liver | Prospective | 63 patients who underwent liver biopsy (10 times cross validation) | Extreme Learning Machine to train single-layer feed-forward neural network | Ultrasound liver images | Accuracy: 96.75%, AUROC: 0.97 (validation performance)
Liu et al[39] | 2017 | Diagnosis of liver cirrhosis | Retrospective | 44 images from controls and 47 images from patients with cirrhosis | SVM | Ultrasound liver capsule images | AUROC: 0.951
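Several rows in Table 3 describe cross-validation, and one specifies leave-one-out cross-validation on a per-patient basis. The Python sketch below shows one way such a split can be organized so that images from the same patient never appear on both the training and the test side. The function name and the patient and image identifiers are hypothetical; this is an illustration of the general idea, not the procedure used in any cited study.

```python
from collections import defaultdict


def leave_one_patient_out(samples):
    """Yield (held_out_patient, train, test) splits grouped by patient.

    `samples` is a list of (patient_id, image_id) pairs. All images of the
    held-out patient go to the test fold, so no patient contributes images
    to both the training and the test side of a split.
    """
    by_patient = defaultdict(list)
    for patient_id, image_id in samples:
        by_patient[patient_id].append(image_id)

    for held_out in sorted(by_patient):
        test = [(held_out, img) for img in by_patient[held_out]]
        train = [
            (patient, img)
            for patient, images in by_patient.items()
            if patient != held_out
            for img in images
        ]
        yield held_out, train, test


if __name__ == "__main__":
    # Hypothetical identifiers: three patients with one to three images each.
    samples = [("P1", "a"), ("P1", "b"), ("P2", "c"),
               ("P3", "d"), ("P3", "e"), ("P3", "f")]
    for patient, train, test in leave_one_patient_out(samples):
        print(patient, "train:", len(train), "test:", len(test))
```

Splitting by image instead of by patient would let near-identical images of one lesion leak into both folds, which tends to inflate the measured performance.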
Table 4 Summary of clinical studies using artificial intelligence in the lower gastrointestinal field
Ref. | Published year | Aim of study | Design of study | Number of subjects | Type of AI | Endoscopic modality | Outcomes
Fernandez-Esparrach et al[40] | 2016 | Detection of colonic polyps | Retrospective | 24 videos containing 31 polyps | Window Median Depth of Valleys Accumulation maps | White-light colonoscopy | Sensitivity: 70.4%. Specificity: 72.4%
Misawa et al[41] | 2018 | Detection of colonic polyps | Retrospective | 546 short videos (training set: 105 polyp-positive videos and 306 polyp-negative videos, test set: 50 polyp-positive videos and 85 polyp-negative videos) from 73 full-length videos | CNN | White-light colonoscopy | Accuracy: 76.5%. Sensitivity: 90.0%. Specificity: 63.3%
Urban et al[42] | 2018 | Detection of colonic polyps | Retrospective | 8641 images with 20 colonoscopy videos | CNN | White-light colonoscopy with NBI | Accuracy: 96.4%. AUROC: 0.991
Klare et al[46] | 2019 | Detection of colonic polyps | Prospective | 55 patients | Automated polyp detection software | White-light colonoscopy | Polyp detection rate: 50.9%. Adenoma detection rate: 29.1%
Wang et al[47] | 2018 | Detection of colonic polyps | Retrospective | Training set: 5545 images from 1290 patients. Validation set A: 27113 images from 1138 patients. Validation set B: 612 images. Validation set C: 138 video clips from 110 patients. Validation set D: 54 videos from 54 patients | CNN | White-light colonoscopy | Dataset A: AUROC: 0.98 for at least one polyp detection, per-image sensitivity: 94.4%, per-image specificity: 95.2%. Dataset B: per-image sensitivity: 88.2%. Dataset C: per-image sensitivity: 91.6%, per-polyp sensitivity: 100%. Dataset D: per-image specificity: 95.4%
Tischendorf et al[48] | 2010 | Classification of colorectal polyps on the basis of vascularization features | Prospective pilot | 209 polyps from 128 patients | SVM | Magnifying NBI images | Accurate classification rate: 91.9%
Gross et al[49] | 2011 | Differentiation of small colonic polyps of < 10 mm | Prospective | 434 polyps from 214 patients | SVM | Magnifying NBI images | Accuracy: 93.1%. Sensitivity: 95.0%. Specificity: 90.3%
Takemura et al[50] | 2010 | Classification of pit patterns | Retrospective | Training set: 72 images. Validation set: 134 images | HuPAS software version 1.3 | Magnifying endoscopic images with crystal violet staining | Accuracies of the type I, II, IIIL, and IV pit patterns of colorectal lesions: 100%, 100%, 96.6%, and 96.7%, respectively
Takemura et al[51] | 2012 | Classification of histology of colorectal tumors | Retrospective | Training set: 1519 images. Validation set: 371 images | HuPAS software version 3.1 using SVM | Magnifying NBI images | Accuracy: 97.8%
Kominami et al[52] | 2016 | Classification of histology of colorectal polyps | Prospective | Training set: 2247 images from 1262 colorectal lesions. Validation: 118 colorectal lesions | SVM with logistic regression | Magnifying NBI images | Accuracy: 93.2%, sensitivity: 93.0%, specificity: 93.3%, PPV: 93%, NPV: 93.3%
Byrne et al[53] | 2017 | Differentiation of histology of diminutive colorectal polyps | Retrospective | Training set: 223 videos. Validation set: 40 videos. Test set: 125 videos | CNN | NBI video frames | Accuracy: 94%, sensitivity: 98%, specificity: 83%
Chen et al[54] | 2018 | Identification of neoplastic or hyperplastic polyps of < 5 mm | Retrospective | Training set: 2157 images. Test set: 284 images | CNN | Magnifying NBI images | Sensitivity: 96.3%, specificity: 78.1%, PPV: 89.6%, NPV: 91.5%
Komeda et al[55] | 2017 | Discrimination of adenomas from non-adenomatous polyps | Retrospective | 1200 images from the endoscopic videos (10 times cross validation) | CNN | White-light colonoscopy with NBI and chromoendoscopy | Accuracy in validation: 75.1%
Mori et al[56] | 2015 | Discrimination of neoplastic changes in small polyps | Retrospective | Test set: 176 polyps from 152 patients | Multivariate regression analysis | Endocytoscopy | Accuracy: 89.2%, sensitivity: 92.0%
Mori et al[57] | 2016 | Development of a second-generation version of the model described in reference number 56 | Retrospective | Test set: 205 small colorectal polyps (≤ 10 mm) from 123 patients | SVM | Endocytoscopy | Accuracy: 89% for both diminutive (< 5 mm) and small (< 10 mm) polyps
Misawa et al[58] | 2016 | Diagnosis of colorectal lesions using microvascular findings | Retrospective | Training set: 979 images. Validation set: 100 images | SVM | Endocytoscopy with NBI | Accuracy: 90%
Mori et al[59] | 2018 | Diagnosis of neoplastic diminutive polyps | Prospective | 466 diminutive polyps from 325 patients | SVM | Endocytoscopy with NBI and stained images | Prediction rate: 98.1%
Takeda et al[60] | 2017 | Diagnosis of invasive colorectal cancer | Retrospective | Training set: 5543 images from 238 lesions. Test set: 200 images | SVM | Endocytoscopy with NBI and stained images | Accuracy: 94.1%, sensitivity: 89.4%, specificity: 98.9%, PPV: 98.8%, NPV: 90.1%
Maeda et al[61] | 2018 | Prediction of persistent histologic inflammation in ulcerative colitis patients | Retrospective | Training set: 12900 images. Test set: 9935 images | SVM | Endocytoscopy with NBI | Accuracy: 91%, sensitivity: 74%, specificity: 97%
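Table 4 reports both per-image and per-polyp sensitivity, and the two can differ substantially for the same video material. The Python sketch below illustrates the distinction under one common assumption: a polyp counts as detected if it is flagged in at least one of its frames. That definition, the function names, and the frame data are hypothetical and are not taken from the cited studies.

```python
def per_image_sensitivity(frames):
    """Fraction of polyp-containing frames in which the polyp was flagged.

    `frames` is a list of (polyp_id, flagged) pairs, one per frame that
    actually shows a polyp.
    """
    detected = sum(1 for _, flagged in frames if flagged)
    return detected / len(frames)


def per_polyp_sensitivity(frames):
    """Fraction of polyps flagged in at least one of their frames."""
    found = {}
    for polyp_id, flagged in frames:
        found[polyp_id] = found.get(polyp_id, False) or flagged
    return sum(found.values()) / len(found)


if __name__ == "__main__":
    # Hypothetical frames: three polyps, some frames missed by the detector.
    frames = [
        ("polyp1", True), ("polyp1", False), ("polyp1", True),
        ("polyp2", False), ("polyp2", True),
        ("polyp3", True),
    ]
    print("per-image:", round(per_image_sensitivity(frames), 3))
    print("per-polyp:", per_polyp_sensitivity(frames))
```

In this toy example every polyp is caught in at least one frame, so per-polyp sensitivity is complete even though a third of the individual frames are missed.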
Table 5 Summary of clinical studies using artificial intelligence in capsule endoscopy
Ref. | Published year | Aim of study | Design of study | Number of subjects | Type of AI | Outcomes
Leenhardt et al[62] | 2019 | Detection of gastrointestinal angiectasia | Retrospective | 600 control images and 600 typical angiectasia images (divided equally into training and test datasets) | CNN | Sensitivity: 100%, specificity: 96%, PPV: 96%, NPV: 100%
Zhou et al[63] | 2017 | Classification of celiac disease | Retrospective | Training set: 6 celiac disease patients, 5 controls. Test set: additional 5 celiac disease patients, 5 controls | CNN | Sensitivity: 100%, specificity: 100% (for test dataset)
He et al[64] | 2018 | Detection of intestinal hookworms | Retrospective | 440000 images | CNN | Sensitivity: 84.6%, specificity: 88.6%
Seguí et al[65] | 2016 | Characterization of small intestinal motility | Retrospective | 120000 images (training set: 100000, test set: 20000) | CNN | Mean classification accuracy: 96%
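PPV and NPV, unlike sensitivity and specificity, depend on how common the finding is in the evaluated dataset. The Python sketch below applies Bayes' rule to show this. It assumes, purely for illustration, a balanced test set (50% prevalence) for the sensitivity of 100% and specificity of 96% listed in the first row of Table 5, and then a hypothetical 5% prevalence. The function name and the 5% figure are assumptions, and the second calculation is not a result of any cited study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given prevalence using Bayes' rule.

    All three arguments are proportions between 0 and 1. The four products
    below are the expected shares of true/false positives and negatives.
    """
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


if __name__ == "__main__":
    # Balanced dataset (assumed): half of the images show the lesion.
    ppv, npv = predictive_values(1.0, 0.96, prevalence=0.5)
    print(f"prevalence 50%: PPV {ppv:.1%}, NPV {npv:.1%}")

    # Hypothetical low-prevalence setting with the same test characteristics.
    ppv, npv = predictive_values(1.0, 0.96, prevalence=0.05)
    print(f"prevalence 5%:  PPV {ppv:.1%}, NPV {npv:.1%}")
```

Under the balanced assumption the computed PPV is about 96%, in line with the tabulated value, whereas the same sensitivity and specificity at 5% prevalence would give a PPV near 57%. This is one concrete way the spectrum bias defined in Table 1 can affect reported performance.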