Artificial intelligence (AI) is based on intelligent agents performing functions associated with the human mind, such as learning and problem solving[1,2].
In endoscopy, AI has begun to improve colonic polyp detection and the adenoma detection rate (ADR), and to discriminate between benign and precancerous lesions based on the interpretation of their superficial patterns.
Machine learning (ML) and deep learning (DL) can be considered subfields of AI. ML is a form of AI that supports the decision process by allowing the applied algorithms to improve without explicit programming, and it includes data testing and the implementation of descriptive and predictive models (Figure 1).
Figure 1 Schematic model of the deep learning algorithm in endoscopy.
ML is divided into supervised and unsupervised methods. Artificial neural networks (ANN), an instance of supervised ML, mirror the functional scheme of the brain. Each neuron is a computing unit, and all neurons are connected to form a network. ML and convolutional neural network (CNN) algorithms have been created to train software to discriminate normal from abnormal regions in the lumen of the gut. For polyp detection, ML uses a fixed number of characteristics, such as polyp size, shape, and mucosal patterns.
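To make the idea of a neuron as a computing unit concrete, the following is a minimal sketch of a tiny feed-forward network. The sigmoid activation, the weights, the biases, and the three input features (standing in for normalized polyp size, shape, and mucosal pattern scores) are illustrative assumptions and are not taken from any of the cited studies.

```python
import math


def sigmoid(x: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias):
    """One computing unit: weighted sum of the inputs, then a non-linearity."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(weighted_sum)


def tiny_network(features):
    """Two hidden neurons feeding one output neuron.

    All weights and biases are hypothetical values chosen for illustration;
    in practice they would be learned from labeled training images.
    """
    hidden = [
        neuron(features, [0.8, -0.4, 0.3], 0.1),
        neuron(features, [-0.5, 0.9, 0.2], -0.2),
    ]
    return neuron(hidden, [1.2, -0.7], 0.05)


# Hypothetical normalized features: size, shape score, mucosal pattern score.
score = tiny_network([0.6, 0.2, 0.9])
```

The output can be read as a score between 0 and 1; a real classifier would differ in size and in how its weights are obtained, not in this basic wiring.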
DL-based methods include a variety of deep neural network architectures that automatically extract relevant imaging features without human perceptual biases.
AI, BARRETT’S ESOPHAGUS AND ESOPHAGEAL CANCER
Barrett's esophagus (BE) is characterized by a metaplastic transformation of the mucosal cells lining the lower part of the esophagus, from normal stratified squamous epithelium to columnar epithelium with interspersed goblet cells. This condition is a risk factor for esophageal adenocarcinoma (EAC), whose poor prognosis is related to late diagnosis. Moreover, 93% of patients can achieve complete disease remission after 10 years of regular surveillance and treatment[5-7]. Promising techniques for the management of BE, with the potential to reduce cancer risk through an accurate diagnosis of dysplasia, are being developed.
Despite some limitations, interventional therapies such as endoscopic resection (ER) and ablation techniques (radiofrequency ablation or cryoablation) can help prevent progression to malignancy[8-11].
The recognition of neoplastic changes in BE patients is crucial, and innovations in endoscopic imaging have enabled the early detection of minimal epithelial neoplastic lesions based on distinct mucosal features.
In a first study, Mendel et al introduced a method for the automatic classification of endoscopic white-light images in which specific features were learned by a pretrained deep residual network rather than derived from handcrafted texture features. The study used a data set of 100 high-resolution endoscopic images from 39 patients supplied by the Endoscopic Vision Challenge at Medical Image Computing and Computer-Assisted Intervention (MICCAI). Of these patients, 22 had cancerous lesions and 17 had non-cancerous BE.
The endoscopic images were independently evaluated by five experts and then compared with the probability maps provided by the AI, showing a strong correspondence. Since the manual segmentations varied significantly, their intersection was taken as the cancerous region (C1-region) within each C1-image.
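The step of taking the intersection of several manual segmentations can be sketched as a pixel-wise logical AND over binary masks. The three small masks below are toy data invented for illustration (the study used five experts and full-resolution images), and the function name is hypothetical.

```python
def intersect_masks(masks):
    """Pixel-wise AND of equally sized binary masks (lists of rows).

    A pixel is kept (1) only if every annotator marked it as cancerous.
    """
    if not masks:
        raise ValueError("at least one mask is required")
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [int(all(mask[r][c] for mask in masks)) for c in range(cols)]
        for r in range(rows)
    ]


# Toy annotations from three hypothetical experts (1 = marked as cancerous).
expert_masks = [
    [[0, 1, 1], [0, 1, 1], [0, 0, 0]],
    [[1, 1, 1], [0, 1, 0], [0, 0, 0]],
    [[0, 1, 1], [1, 1, 1], [0, 0, 0]],
]
consensus = intersect_masks(expert_masks)
```

The intersection is a conservative choice: it shrinks the reference region to the pixels on which all annotators agree, at the cost of discarding areas marked by only some of them.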
Ebigbo et al employed two data sets to train and validate a computer-aided diagnosis (CAD) system relying on a deep CNN with a residual net (ResNet) architecture. The Augsburg data set consisted of 148 high-definition white-light endoscopy (WLE) and narrow-band imaging (NBI) images of 33 EAC and 41 areas of non-neoplastic BE, while the MICCAI data set comprised 100 high-definition WLE images of 17 early EAC and 22 areas of non-neoplastic BE. The CAD-DL system diagnosed EAC with a sensitivity of 97% and a specificity of 88% for WLE images, and with a sensitivity of 94% and a specificity of 80% for NBI images. For the MICCAI images, CAD-DL reached a sensitivity of 92% and a specificity of 100%.
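Since sensitivity and specificity are reported throughout this review, a short sketch of how they follow from a confusion matrix may be useful. The counts below are hypothetical: they were chosen only so that rounding reproduces figures of 97% and 88%, and they are not reported in the text above.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of diseased cases that the test flags as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of non-diseased cases that the test flags as negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only.
sens = sensitivity(true_pos=32, false_neg=1)
spec = specificity(true_neg=36, false_pos=5)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Because each measure is computed within one disease class, neither depends on how common the disease is in the test set, which is one reason studies with very different case mixes can still be compared on these two figures.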
In these initial studies, the authors developed a CAD model that achieved promising performance scores in classification and segmentation during BE assessment.
However, these results were achieved using high-quality endoscopic images that cannot always be obtained in daily clinical practice. The system was subsequently developed further to increase the speed of image analysis for classification and the resolution of the dense prediction, which displays the color-coded spatial distribution of cancer probabilities.
Still based on deep CNNs and a ResNet architecture, a state-of-the-art encoder-decoder network (DeepLab V.3+) was adapted. To transfer the endoscopic live stream to the AI system, a capture card (Avermedia, Taiwan) for image acquisition was connected to the endoscopic monitor, and the AI system was trained using 129 endoscopic images. All AI-image outcomes were confirmed by pathological examination of resection specimens (EAC) or forceps biopsies (normal BE). The AI system showed high performance in the classification task, with a sensitivity and specificity of 83.7% and 100%, respectively.
A CNN was also used by Horie et al, who retrospectively collected 8428 training images of esophageal cancer from 384 patients. The CNN took 27 seconds to analyze 1118 test images and correctly detected esophageal cancer cases with a sensitivity of 98%. It detected all 7 small cancer lesions less than 10 mm in size. This system could facilitate early and rapid detection of malignancy, leading to a better prognosis for these patients.
AI can help endoscopists take targeted biopsies with high accuracy, sparing labor- and time-intensive random sampling, which has a low sensitivity (64%) for the detection of dysplasia. An international, randomized, crossover trial compared high-definition white-light endoscopy (HD-WLE) and NBI for detecting intestinal metaplasia (IM) and malignancy in 123 patients with BE (mean circumferential and maximal sizes, 1.8 and 3.6 cm, respectively).
Both HD-WLE and NBI detected IM in 104/113 (92%) patients, but NBI required fewer biopsies per patient and exhibited a significantly higher dysplasia detection rate (30% vs 21%). During endoscopic examination with NBI, all areas of high-grade dysplasia (HGD) and cancer presented an irregular mucosal or vascular pattern. Regular NBI surface patterns did not harbor HGD or cancer, suggesting that biopsies could potentially be avoided in such cases. In addition, in a multicenter, randomized crossover study of endoscopic trimodal imaging (ETMI) for the detection of early neoplasia in BE, ETMI showed no improvement in overall dysplasia detection over standard video endoscopy. The diagnosis of dysplasia was still made by random biopsies in a significant number of patients, and patients with a confirmed diagnosis of low-grade intraepithelial neoplasia (LGIN) had a significant risk of high-grade intraepithelial neoplasia (HGIN) or carcinoma.
Van der Sommen et al developed a computer algorithm to detect early neoplastic lesions in BE that employed specific texture and color filters and ML, based on 100 images from 44 patients with BE. This system identified early neoplastic lesions at the patient level with a sensitivity and specificity of 86% and 87%, respectively. The authors concluded that the automated computer algorithm was able to identify early neoplastic lesions with reasonable accuracy.
De Groof et al developed a CAD system based on endoscopic images of 40 Barrett's neoplastic lesions and 20 non-dysplastic BE, reaching a sensitivity and specificity for the detection of such lesions of 95% and 85%, respectively.
AI technology was applied to volumetric laser endomicroscopy (VLE) in 2017. VLE with laser marking is a wide-field advanced imaging technology that became commercially available in the United States in 2013 to facilitate dysplasia detection.
VLE can enhance the detection of neoplastic lesions in BE by performing a circumferential scan of the esophageal wall layers. In one study, 16 patients with BE were included and a total of 222 laser markers (LMs) were placed, 97% of which were visible on WLE. All LMs were evident on VLE directly after marking, and 86% were confirmed during the post hoc analysis. LM targeting of cautery marks had an accuracy of 85%. This first-in-human study showed that VLE-guided LM is feasible and safe.
In another study, the same authors used a database of VLE images from BE endoscopic resection specimens with or without neoplasia, precisely correlated with histology, to develop a VLE prediction score. The receiver operating characteristic curve of this prediction score showed an area under the curve (AUC) of 0.81. A score ≥ 8 was associated with an 83% sensitivity and a 71% specificity.
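The relation between a prediction score, its AUC, and the sensitivity and specificity at a chosen cut-off can be sketched as follows. The AUC is computed here through its rank interpretation (the probability that a randomly chosen neoplastic case scores higher than a randomly chosen non-neoplastic one, with ties counted as one half). The scores are toy values invented for illustration and are not data from the VLE study.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))


def sens_spec_at(pos_scores, neg_scores, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' is called positive."""
    sens = sum(s >= cutoff for s in pos_scores) / len(pos_scores)
    spec = sum(s < cutoff for s in neg_scores) / len(neg_scores)
    return sens, spec


# Hypothetical prediction scores for illustration only.
neoplastic = [9, 8, 10, 7, 8, 11]
non_neoplastic = [5, 8, 6, 7, 4, 9, 3]

area = auc(neoplastic, non_neoplastic)
sens, spec = sens_spec_at(neoplastic, non_neoplastic, cutoff=8)
```

Moving the cut-off trades sensitivity against specificity along the ROC curve, whereas the AUC summarizes the whole curve in a single threshold-independent number.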
Optical coherence tomography (OCT) is a technique that produces high-resolution esophageal images during endoscopy. OCT can distinguish specialized IM from squamous epithelium, but image criteria for distinguishing intramucosal carcinoma (IMC) and HGD from low-grade dysplasia (LGD), indeterminate-grade dysplasia (IGD), and specialized IM without dysplasia have not yet been established.
Evans et al examined 177 OCT images from patients with a histological diagnosis of BE. Histopathology showed IMC/HGD in 49 cases, LGD in 15, IGD in 8, specialized IM in 100, and gastric mucosa in 5. A significant correlation was found between an IMC/HGD histopathologic result and the scores for each image feature, namely surface maturation and gland architecture. With a dysplasia index ≥ 2, the sensitivity and specificity for diagnosing IMC/HGD were 83% and 75%, respectively.
In a tertiary-care center, 27 BE patients underwent 50 endoscopic mucosal resections (EMRs) that were imaged by VLE and probe-based confocal laser endomicroscopy (pCLE) and classified as neoplastic or non-neoplastic on the basis of histology. The sensitivity and specificity of pCLE for detecting BE dysplasia were 76% and 79%, respectively. The OCT scoring index (OCT-SI) showed a sensitivity of 70% and a specificity of 60%, whereas the novel VLE diagnostic algorithm (VLE-DA) showed a sensitivity of 86%, a specificity of 88%, and a diagnostic accuracy of 87%.
Esophageal squamous cell carcinoma (SCC) is the sixth leading cause of cancer mortality worldwide, and developing countries are disproportionately affected owing to delayed diagnosis. Lugol's chromoendoscopy is currently the gold standard technique for identifying SCC during gastroscopy; its sensitivity is high (over 90%), but its specificity is low (about 70%).
Among non-invasive tests, NBI is another approach, but it showed a low diagnostic specificity in a randomized controlled trial (RCT), related to the physician’s experience.
High-resolution microendoscopy (HRME) has shown the potential to enhance esophageal SCC detection during screening. An automated, real-time analysis algorithm has been developed and assessed using training, test, and validation images derived from a previous in vivo study of 177 subjects enrolled in screening or surveillance programs. In a post hoc analysis, the algorithm recognized malignant tumors with a 95% sensitivity and a 91% specificity in the validation dataset, compared with 84% and 95% in the original study. This technology could therefore be applied in settings where operators have less expertise in interpreting HRME images.
Kodashima et al designed a computer system to simplify the differentiation between neoplastic and healthy tissue in endocytoscopy images of the esophagus. By analyzing the nuclear area in images collected from 10 patients and comparing it with histopathology, the system aimed to achieve an accurate and automatic diagnosis.
Shin et al developed a quantitative image analysis algorithm that was able to distinguish squamous dysplasia from non-neoplastic mucosa. They analyzed HRME images from 177 subjects undergoing upper endoscopy for SCC screening or surveillance. Quantitative data from the high-resolution images were used to create an algorithm to identify lesions with high-grade squamous dysplasia or invasive SCC on histopathology.
The best performance was achieved using the mean nuclear area as the input for classification, resulting in a sensitivity and specificity of 93% and 92% in the training set, 87% and 97% in the test set, and 84% and 95% in an independent validation set, respectively. ER is used to treat tumors with submucosal invasion depth 1 (SM1), whereas surgical removal with or without chemoradiotherapy is usually used for SCC cases with a tumor infiltration deeper than SM2.
Accordingly, the preoperative endoscopic estimation of esophageal SCC (ESCC) invasion depth is critical. The application of AI with DL in medicine has recently improved rapidly. A study by Tokai et al evaluated the efficacy of AI in measuring ESCC invasion depth using a set of 1751 ESCC training images. The AI detected 95.5% (279/291) of the ESCC in the test images in 10 seconds; when analyzing these 279 images, it correctly predicted the invasion depth of the ESCC with an 84.1% sensitivity and an 80.9% accuracy in 6 seconds, a more accurate estimation of ESCC invasion depth than that of endoscopists.
AI AND GASTRIC CANCER
Gastric cancer (GC) is the third leading cause of cancer mortality worldwide, and esophagogastroduodenoscopy (EGD) is considered the best diagnostic tool for detecting neoplasms at an early stage. The treatment of gastric tumors depends on the depth of invasion: differentiated intramucosal tumors (M) or those that invade the superficial submucosal layer (≤ 500 μm: SM1) are treated with ER, while those with deep submucosal invasion (> 500 μm: SM2) should be treated surgically because of the risk of local invasiveness and metastases. Magnifying endoscopy combined with NBI or flexible spectral imaging color enhancement (FICE) is clinically useful in discriminating malignant from non-malignant gastric areas[30-34]. However, this optical diagnosis depends strictly on the expertise and experience of the operator, which prevents its general use in clinical practice.
Two RCTs examined the performance of endoscopy with and without the support of AI algorithms. The first evaluated a real-time DL system, WISENSE, for monitoring blind spots during EGD. Overall, 324 patients were randomized to undergo endoscopy with or without WISENSE, which monitored blind spots with a 90% average accuracy and a per-site accuracy ranging from 70.2% to 100% in the 107 live endoscopic videos.
The average sensitivity and specificity were 87.6% and 95%, ranging from 63.4% to 100% and from 75% to 100%, respectively. For timing the endoscopic procedure, WISENSE accurately predicted the start and end times in 93.5% (100/107) and 97.2% (104/107) of videos, respectively.
Miyaki et al developed software allowing a quantitative evaluation of mucosal GCs on magnifying endoscopy images obtained with FICE. They applied a bag-of-features framework with densely sampled scale-invariant feature transform descriptors to magnifying FICE images of 46 intramucosal GCs and compared the results with histologic findings. The CAD system achieved an 86% detection accuracy, with a sensitivity and specificity for cancer diagnosis of 85% and 87%, respectively.
In the study by Kanesaka et al, 127 patients with early gastric cancer (EGC) contributed 127 cancerous magnifying NBI (M-NBI) images, while 20 patients without EGC provided 60 non-cancerous M-NBI images. The authors created software that both identified GC and outlined the border between malignant and non-malignant regions. This CAD algorithm was designed to analyze grey-level co-occurrence matrix features of partitioned pixel slices of M-NBI images, and a support vector machine was used as the ML method. The model showed a 97% sensitivity and a 95% specificity in distinguishing cancer, while the performance for area concordance showed a sensitivity and specificity of 81% and 66%, respectively.
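As a rough illustration of what a grey-level co-occurrence matrix (GLCM) feature is, the sketch below counts how often pairs of grey levels occur next to each other in a small image patch and derives one common texture feature (contrast). The 4 x 4 patch, the horizontal offset, the four grey levels, and the function names are assumptions made for illustration; the features, offsets, and classifier settings used in the cited study are not specified in the text above.

```python
def glcm(image, levels, d_row=0, d_col=1):
    """Count co-occurrences of grey levels (i, j) at the given pixel offset."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """GLCM contrast: squared grey-level difference weighted by pair frequency."""
    total = sum(sum(row) for row in matrix)
    return sum(
        (i - j) ** 2 * count / total
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )


# Hypothetical 4 x 4 patch quantized to four grey levels (0 to 3).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
cooccurrence = glcm(patch, levels=4)
patch_contrast = contrast(cooccurrence)
```

In a pipeline of the kind described, several such texture features computed per image slice would form the input vector for the support vector machine; that classification step is not shown here.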
In 2018, Hirasawa et al developed an AI-based diagnostic system to detect GC using a CNN, an architecture inspired by the human brain.
A total of 714 of the 2,296 test images (31.1%) showed confirmed GC, and 84.1% showed moderate or severe gastric atrophy. The CNN took 47 seconds to analyze the 2,296 test images and flagged 232 lesions as GC overall: 71 of the 77 true GC lesions were correctly identified (sensitivity 92.2%), while the other 161 flagged lesions were non-malignant. Almost all gastric lesions with a diameter ≥ 6 mm (98.6%) were correctly identified by the CNN, as were all invasive carcinomas (T1b or deeper). The missed lesions were superficially depressed and were more frequently differentiated-type intramucosal cancers, which are difficult to distinguish from gastric inflammation even for experienced endoscopists. Another common reason for misdiagnosis was location at the cardia, incisura angularis, or pylorus.
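The lesion counts above can be checked with a few lines of arithmetic. The sketch assumes that the 232 lesions flagged as GC consist of the 71 correctly detected cancers plus the 161 non-malignant lesions; the positive predictive value (PPV) is not stated in the text and is derived here only under that assumption.

```python
# Counts taken from the paragraph above.
true_gc_lesions = 77          # GC lesions present in the test set
detected_true_gc = 71         # GC lesions correctly flagged by the CNN
flagged_non_malignant = 161   # non-malignant lesions flagged as GC

# Assumption: every flagged lesion is either a true GC or non-malignant.
flagged_total = detected_true_gc + flagged_non_malignant

sens = detected_true_gc / true_gc_lesions
ppv = detected_true_gc / flagged_total  # derived, not reported in the text

print(f"flagged = {flagged_total}, sensitivity = {sens:.1%}, PPV = {ppv:.1%}")
```

Under this reading, a high sensitivity coexists with a modest PPV, which is typical of a detector tuned to miss as few cancers as possible.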
Zhu et al examined the potential of AI to predict the invasion depth of early GC. In particular, they developed and validated a CNN-based CAD model (CNN-CAD) that used a DL algorithm to determine EGC invasion depth (“M/SM1” vs “SM2 or deeper”).
A total of 790 endoscopic images of GCs were used for training, while an additional 203 images, completely independent of the learning material, were used as a test set. The AI model exhibited a sensitivity and specificity of 76% and 96%, respectively, in identifying SM2 or deeper invasion, a higher diagnostic performance than that reached by endoscopists. This high specificity could lessen the overestimation of tumor invasion and thereby indirectly reduce avoidable surgeries for M/SM1 malignancies. Moreover, in this study, the CNN-CAD system also achieved significantly greater accuracy and specificity than both expert and junior endoscopists.
AI might help physicians predict the prognosis of patients with GC. Several pivotal clinical trials evaluating adjuvant strategies for advanced GC were conducted over the past decade, but the most suitable therapy for GC remains uncertain. Moreover, two recent molecular landscape studies demonstrated the existence of distinct molecular GC subtypes[40,41].
A DL-based model (survival recurrent network, SRN) was developed to predict survival events in a total of 1190 GC patients based on clinical and pathological data as well as treatment regimens, predicting the outcome at each time point during a 5-year surveillance period.
The SRN indicated that the mesenchymal subtype of GC should prompt a tailored postoperative therapeutic strategy because of its high risk of recurrence. Conversely, the SRN found that GCs with microsatellite instability and the papillary type had a significantly more favorable prognosis after chemotherapy including capecitabine and cisplatin. The SRN's survival prediction reached 92% at 5 years after curative gastrectomy.
An ANN model was used to evaluate 452 GC patients, predicting survival times with approximately 90% accuracy, with a focus on designing an ANN structure able to handle censored data. In detail, 5 single time-point feed-forward ANN models were generated to predict the outcomes of GC patients at yearly intervals until the fifth year after gastrectomy. The ANN prediction models exhibited accuracy, sensitivity, and specificity ranging from 88.7% to 90.2%, 70.2% to 92.5%, and 66.7% to 96.2%, respectively.
AI IN THE IDENTIFICATION OF HELICOBACTER PYLORI INFECTION
Helicobacter pylori (H. pylori) infects the gastric epithelial cells and is associated with functional dyspepsia, peptic ulcers, mucosal atrophy, intestinal metaplasia, and GC. H. pylori-associated chronic gastritis may also raise the risk of GC[45,46]. CNN technology can accurately assess H. pylori infection during conventional endoscopy without the need for biopsies. In a pilot study, Zheng et al produced a computer-aided decision support system that uses a CNN to estimate H. pylori infection based on endoscopic images. Of 1959 patients, 77% were assigned to the derivation cohort (1507 patients; 11729 gastric images), of whom 56% had H. pylori infection (847), while 23% were assigned to the validation cohort (452 patients; 3755 images), of whom 69% were H. pylori infected (310).
Huang et al applied neural networks (refined feature selection with a neural network, RFSNN) to predict H. pylori-related gastric histological features based on standard endoscopic images. The authors trained the model using endoscopic images of 30 patients and then used image parameters from a different cohort of 74 patients to generate a model predicting H. pylori infection, which showed an 85% sensitivity and a 91% specificity. Moreover, RFSNN showed an accuracy above 80% in predicting the presence of gastric atrophy, IM, and the severity of H. pylori-related gastritis.
Shichijo et al built a 22-layer deep CNN to predict H. pylori infection during real-time endoscopy. A dataset of 32208 images from 735 H. pylori-positive and 1015 H. pylori-negative patients was used. The sensitivity, specificity, and accuracy were 81.9%, 83.4%, and 83.1%, respectively, for the first CNN and 88.9%, 87.4%, and 87.7%, respectively, for the second CNN, with similar analysis times (198 and 194 seconds, respectively).
Another study group developed a CNN using 179 endoscopic images obtained from 139 patients (65 H. pylori-positive and 74 H. pylori-negative). Of these images, 149 were used to train the neural network, and the remaining 30 (15 from H. pylori-negative and 15 from H. pylori-positive patients) were used as test images. The CAD model showed an 87% sensitivity and specificity for detecting H. pylori infection, with an AUC of 0.96.
Nakashima et al used images from 162 patients as learning material and images from 60 patients as a test data set. From each patient, three white-light images (WLI), three blue laser imaging (BLI)-bright images, and three linked color imaging (LCI) images (Fujifilm Corp.) were obtained. For WLI, the AUC was 0.66.
AI FOR COLONIC POLYPS AND COLON CANCER
Colorectal cancer (CRC) is the third most frequent malignancy in males and the second in females, and the fourth most frequent cause of cancer death. The National Polyp Study reported that 70%-90% of CRCs can be prevented by routine endoscopic surveillance and removal of polyps, but 7%-9% of CRCs can occur despite these measures.
Around 85% of “interval cancers” are due to missed or inadequately removed polyps. Adenomas are the most common precancerous lesions throughout the colon. The ADR measures the endoscopist's ability to identify adenomas. The ADR ranges from 7% to 53% among endoscopists, depending on their training, endoscopic removal technique, withdrawal time, quality of bowel preparation, and other procedure-dependent determinants[56,57].
Several endoscopic innovations have been promoted to increase the ADR[58,59].
A review of 5 studies on the effect of high-resolution colonoscopes on the ADR showed conflicting results; one study concluded that the ADR increased only for endoscopists with an ADR lower than 20%.
CAD analysis has the potential to aid adenoma detection further.
Urban et al used a diverse and representative set of 8641 hand-labeled images from screening colonoscopies of over 2000 patients. They tested the models on 20 colonoscopy videos with a total duration of 5 hours. Expert colonoscopists were asked to identify all polyps in 9 de-identified colonoscopy videos, selected from archived video studies, with or without the benefit of the CNN overlay. Their findings were compared with those of the CNN, using CNN-assisted expert review as the reference. The CNN identified polyps with an AUC of 0.99 and an accuracy of 96.4%. In the analysis of colonoscopy videos in which 28 polyps were removed, 4 expert reviewers identified 8 additional (missed) polyps without CNN assistance and a further 17 polyps with CNN assistance. All polyps removed and all polyps identified by expert review were detected by the CNN, which had a 7% false-positive rate. This strategy could improve the ADR and lower the number of interval cancers, but it requires further study before it can be adequately implemented.
AI can be used during endoscopic assessment to automatically recognize colorectal polyps and distinguish between malignant and non-malignant lesions. Real-time CAD depends on the latency between image acquisition and processing for final display on the screen. This model was able to detect polyps with a 96.5% sensitivity[62,63].
A recent RCT estimated the impact of a DL-based automatic polyp detection system during real-time endoscopy. This study, which enrolled 1058 patients, demonstrated that the AI system increased the ADR by almost 10%.
A prospective study of 55 patients used a prototype of a novel automated polyp detection software (APDS) for automated image-based polyp detection, which achieved an overall real-time polyp detection rate of 75%. Smaller polyp size and flat polyp morphology were associated with insufficient polyp detection by the APDS.
Aside from computer-aided detection (CADe), computer-aided diagnosis (CADx) has been used to differentiate between adenomas and hyperplastic polyps.
Byrne et al suggested the use of computerized image analysis to reduce the variability in endoscopic detection and histological prediction. Their AI model was trained using endoscopic videos and was able to discriminate between diminutive adenomas and hyperplastic polyps with high accuracy. It predicted histology with a 94% accuracy, 98% sensitivity, and 83% specificity, and with negative and positive predictive values of 97% and 90%, respectively.
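Unlike sensitivity and specificity, predictive values depend on how common the target condition is in the tested population. The sketch below applies Bayes' rule to the reported sensitivity (98%) and specificity (83%); the prevalence values of 60% and 10% are assumptions chosen for illustration and are not reported in the text.

```python
def predictive_values(sens: float, spec: float, prevalence: float):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sens * prevalence
    false_neg = (1 - sens) * prevalence
    true_neg = spec * (1 - prevalence)
    false_pos = (1 - spec) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed adenoma prevalence among diminutive polyps (illustrative only).
ppv_high, npv_high = predictive_values(0.98, 0.83, prevalence=0.60)

# Same test in a hypothetical population where adenomas are rare.
ppv_low, npv_low = predictive_values(0.98, 0.83, prevalence=0.10)
```

With the assumed 60% prevalence, the computed values round to a PPV of 90% and an NPV of 97%; at lower prevalence the PPV falls while the NPV rises, which is why predictive values from one cohort do not transfer directly to another.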
Moreover, an AI-assisted image classifier based on NBI without optical magnification has been employed to predict the histology of isolated colonic lesions, following the evaluation of 3509 colonic lesions. The most prevalent histological types were tubular adenoma (47.6%), carcinoma with deep invasion (15.9%), carcinoma with superficial invasion (7.9%), hyperplastic polyps (14.3%), sessile serrated polyps (7.9%), and tubulovillous adenomas (6.6%). The sensitivity for hyperplastic and serrated polyps was 96.6%, although it was lower for tubular adenoma and cancer. When only diminutive colonic polyps were considered, the agreement between surveillance colonoscopy intervals determined with the AI image classifier and with histology was 0.97. This classifier also showed high accuracy (88.2%) in predicting carcinoma with deep invasion, which is not endoscopically curable, and its high negative predictive value (NPV) and accuracy for carcinoma with deep invasion suggest that it can help select treatable lesions.
The same author assessed the use of AI-assisted image classifiers in determining the feasibility of ER of large colonic lesions based on non-magnified images. The independent testing set included 76 large colonic lesions that fulfilled the indications for endoscopic submucosal dissection. Overall, the trained AI image classifier showed an 88.2% sensitivity (95%CI: 84.7-91.1%), a 77.9% specificity (95%CI: 70.3-84.4%), and an 85.5% accuracy (95%CI: 82.4-88.3%) in differentiating endoscopically curable from incurable lesions. This study demonstrated the high accuracy of the trained AI image classifier in predicting the feasibility of curative ER of large colonic lesions. Given the rapid progress of CNN-based AI in recognizing specific mucosal patterns and classifying images, its prediction performance might outperform that of an expert endoscopist in the near future.
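Confidence intervals such as those quoted above express the uncertainty of a proportion estimated from a finite sample. A minimal sketch using the Wilson score interval follows; the text does not state which interval method the cited study used, so Wilson is an assumption here, and the counts in the usage line are hypothetical.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.959964):
    """Wilson score interval for a proportion (z = 1.96 gives about 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p_hat = successes / total
    denom = 1 + z ** 2 / total
    center = (p_hat + z ** 2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / total + z ** 2 / (4 * total ** 2))
    ) / denom
    return center - half_width, center + half_width


# Hypothetical example: 50 correct classifications out of 100 images.
low, high = wilson_interval(50, 100)
print(f"95% CI: {low:.1%} to {high:.1%}")
```

The interval narrows as the number of evaluated images grows, so a point estimate from a small test set should be read together with its interval rather than on its own.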
Hotta et al aimed to validate the effectiveness of endocytoscopy (EC)-CAD in diagnosing malignant or non-malignant colorectal lesions by comparing its diagnostic ability with that of expert and non-expert endoscopists in web-based tests. A validation test was built using endocytoscopic images of 100 small colorectal lesions (< 10 mm). On stained endocytoscopic images, the diagnostic accuracy and sensitivity of the EC-CAD system EB-01 vs non-expert endoscopists were reported as 98.0% vs 69.0%, significantly higher for EB-01 when diagnosing small colorectal lesions.
A single-group, open-label, prospective study assessed the performance of real-time EC-CAD in 791 consecutive patients undergoing colonoscopy by 23 endoscopists, aiming to differentiate neoplastic polyps (adenomas) requiring resection from non-neoplastic polyps not requiring treatment, potentially reducing cost. The negative predictive value of CAD in stained mode was 96.4% in the best-case scenario and 93.7% in the worst-case scenario; with NBI, it was 96.5% and 95.2%, respectively.
Another study developed an automatic quality control system (AQCS) based on a deep CNN and assessed whether it improved polyp and adenoma detection in clinical practice. The primary outcome was the ADR in the 308 AQCS and 315 control patients. The ADR was significantly higher in the AQCS group than in the control group. A significant improvement was also seen in the polyp detection rate and in the mean number of polyps identified per procedure.
Finally, in a study of 117 patients with stage IIA CRC after radical surgery, an ANN-based scoring system built on tumor molecular features stratified patients into three groups with different probabilities of survival at 10 years. The 10-year overall survival rates in the three groups were 16.7%, 62.9%, and 100% (P < 0.001), whereas the 10-year disease-free survival rates were 16.7%, 61.8%, and 98.8%, respectively. This study suggested that the scoring system could identify high-risk individuals with stage IIA CRC who may warrant a more aggressive therapeutic approach.
DL identified patients with a complete response to neoadjuvant chemoradiotherapy for locally advanced rectal cancer with an 80% accuracy. This technological support might help select patients who would benefit from conservative treatment rather than complete surgical resection. This is the first study using DL to predict complete pathological response after neoadjuvant chemoradiotherapy in locally advanced rectal cancer.
AI could become an essential diagnostic tool for endoscopists and gastroenterologists in tailoring patients' treatments and predicting their clinical outcomes.
AI seems particularly valuable in gastrointestinal endoscopy for improving the detection of premalignant, malignant, and inflammatory lesions, gastrointestinal bleeding, and pancreaticobiliary diseases.
However, current limitations of AI include the lack of high-quality datasets for ML development. Moreover, a substantial part of the evidence used to develop ML algorithms comes only from preclinical studies, in which potential selection biases cannot be excluded. In this setting, rigorous validation of AI performance is necessary before its use in daily clinical practice.
A realistic measure of AI accuracy should account for the effects of overfitting and spectrum bias on performance.
Overfitting occurs when a learning model tailors itself too closely to the training dataset, so that its predictions do not generalize well to new datasets[75,76]. This effect runs against the problem-solving principle of Occam’s razor, which favors simpler theories for their better predictive quality. In the worst cases of AI algorithm application, underfitting can occur, yielding models that cannot accurately capture the underlying structure of the dataset and therefore also predict poorly.
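Overfitting can be illustrated with a deliberately small, deterministic example. A one-nearest-neighbour rule memorizes every training case, including two mislabeled ones, whereas a simple threshold rule ignores them. The data, the single feature (a hypothetical measurement x), and the cut-off of 5.0 are all invented for illustration and do not come from any cited study.

```python
def one_nn_predict(train, x):
    """Memorizing model: copy the label of the closest training case."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


def threshold_predict(x, cutoff=5.0):
    """Simple model: call a case positive when x reaches the cut-off."""
    return int(x >= cutoff)


def accuracy(predict, data):
    """Fraction of (x, label) pairs that the predictor labels correctly."""
    return sum(predict(x) == y for x, y in data) / len(data)


# Toy data; the training labels at x = 3 and x = 7 are deliberately noisy.
train = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 0), (8, 1), (9, 1)]
test = [(2.8, 0), (3.4, 0), (4.4, 0), (5.6, 1), (6.8, 1), (7.7, 1)]

nn_train = accuracy(lambda x: one_nn_predict(train, x), train)
nn_test = accuracy(lambda x: one_nn_predict(train, x), test)
rule_train = accuracy(threshold_predict, train)
rule_test = accuracy(threshold_predict, test)
```

The memorizing model scores perfectly on the data it has seen and poorly on new cases, while the simpler rule does the opposite, which is the pattern that independent validation data are meant to expose.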
On the other hand, spectrum bias occurs when the dataset used for model development is not representative of the target population[75,79]. To avoid overestimating accuracy and generalizability, an external validation dataset collected in a way that minimizes spectrum bias should be used. In addition, well-designed multicenter observational studies are required for stronger validation.
It is also important to acknowledge ethical issues, since AI is not aware of the patient’s preferences or of legal liabilities. Privacy issues could be addressed using federated datasets that do not involve centralized servers.
Future randomized studies could directly assess the overall value (quality vs cost) of the CNN by examining its effects on surveillance colonoscopy, endoscopic time, polyp detection rate and ADR, and pathology charges.
Since AI science is still evolving, its current limitations, including the difficulty of making predictions in situations characterized by uncertainty, carry over to its medical applications and must be regarded as future challenges.
More generally, AI is transforming technology and also raises other ethical questions, such as the replacement of human work by machines, although this has been an open question since the industrial revolution.
What can be done is to promote mutual collaboration through gastrointestinal endoscopy applications, so that both fields of science benefit from each other's achievements.