Minireviews Open Access
Copyright ©The Author(s) 2021. Published by Baishideng Publishing Group Inc. All rights reserved.
World J Gastroenterol. Jun 7, 2021; 27(21): 2818-2833
Published online Jun 7, 2021. doi: 10.3748/wjg.v27.i21.2818
Requirements for implementation of artificial intelligence in the practice of gastrointestinal pathology
Hiroshi Yoshida, Tomoharu Kiyuna
Hiroshi Yoshida, Department of Diagnostic Pathology, National Cancer Center Hospital, Tokyo 104-0045, Japan
Tomoharu Kiyuna, Digital Healthcare Business Development Office, NEC Corporation, Tokyo 108-8556, Japan
ORCID number: Hiroshi Yoshida (0000-0002-7569-7813); Tomoharu Kiyuna (0000-0003-3050-6718).
Author contributions: Yoshida H and Kiyuna T contributed equally to this work.
Conflict-of-interest statement: All authors have no competing interests to be declared.
Open-Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/Licenses/by-nc/4.0/
Corresponding author: Hiroshi Yoshida, MD, PhD, Staff Physician, Department of Diagnostic Pathology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan. hiroyosh@ncc.go.jp
Received: February 4, 2021
Peer-review started: February 4, 2021
First decision: March 6, 2021
Revised: March 16, 2021
Accepted: April 28, 2021
Article in press: April 28, 2021
Published online: June 7, 2021

Abstract

Tremendous advances in artificial intelligence (AI) in medical image analysis have been achieved in recent years. The integration of AI is expected to cause a revolution in various areas of medicine, including gastrointestinal (GI) pathology. Currently, deep learning algorithms have shown promising benefits in areas of diagnostic histopathology, such as tumor identification, classification, prognosis prediction, and biomarker/genetic alteration prediction. While AI cannot substitute for pathologists, carefully constructed AI applications may increase workforce productivity and diagnostic accuracy in pathology practice. Despite these promising advances, unlike in radiology or cardiology imaging, no histopathology-based AI application has been approved by a regulatory authority or for public reimbursement. This implies that there are still some obstacles to be overcome before AI applications can be safely and effectively implemented in real-life pathology practice. Challenges have been identified at different stages of the development process, such as needs identification, data curation, model development, validation, regulation, modification of daily workflow, and cost-effectiveness balance. The aim of this review is to present the challenges in the process of AI development, validation, and regulation that should be overcome for its implementation in real-life GI pathology practice.

Key Words: Artificial intelligence, Deep learning, Digital image analysis, Digital pathology, Clinical implementation, Gastrointestinal cancer

Core Tip: Advances in artificial intelligence (AI) are expected to revolutionize various areas of medical practice. Deep learning algorithms have shown promising benefits in various areas of diagnostic histopathology. Despite this, histopathology-based AI technology is not yet widely used as a medical device, and none has been approved by a regulatory authority. This implies that certain improvements in the development process are still necessary for the implementation of AI in real-life histopathology practice. This paper aims to provide a review of recent AI developments in gastrointestinal pathology and the challenges in their implementation.



INTRODUCTION

The integration of artificial intelligence (AI) will cause a revolution in various areas of medicine[1], including gastrointestinal (GI) pathology, in the next decade. Advances in slide scanner technology have made it possible to quickly digitize histological slides at high resolution for use in clinical practice, research, and education[2-4]. The drastic increase in computing capacity and the improvement in information technology (IT) infrastructure have allowed rapid and efficient processing of large data such as whole slide images (WSIs). In recent years, there has been an increase in computer applications utilizing AI to analyze images[5].

AI is an umbrella term for the different strategies a computer can employ to think and learn like a human. Pathological AI models have progressed from expert systems to conventional machine learning (ML) and deep learning (DL)[6]. Both expert systems and conventional ML use expert knowledge and expert-defined rules about objects. In contrast, DL extracts features directly from the raw data and passes them through multiple hidden layers to produce the output[7] (Figure 1). Compared to conventional ML, DL is simpler to conduct, performs with high precision, and is cost-effective[5,8]. Its implementation enhances the reproducibility of the subjective visual assessment by human pathologists and integrates multiple parameters for precision medicine[9,10]. Currently, DL algorithms have shown promising benefits in different facets of diagnostic histopathology, such as tumor identification, classification, prognosis prediction, and biomarker/genetic alteration prediction[5,11]. In addition, various AI applications have been developed for GI pathology[12-14].

Figure 1 General workflow of construction of artificial intelligence model in pathology. Stained slides are converted to digital input images by a slide scanner. Both (a) hand-crafted feature engineering and (b) deep learning approach generate outputs of classification, which are applied to various clinically relevant predictions.

AI applications using DL algorithms have demonstrated various benefits in the field of GI pathology. Recent reviews (gastric and colorectal) provide an overview of the rapid and extensive progress in the field[5,11-14]. In 2017, the Philips IntelliSite (Philips Electronics, Amsterdam, The Netherlands) whole-slide scanner was approved by the Food and Drug Administration (FDA) in the United States. The implementation of AI in pathology is also promoted by various startups such as DeepLens[15] and PathAI[16]. Some institutions have agreed to digitize their pathology workflow[17,18]. Although these advances are promising, unlike in the field of radiology or cardiology imaging[19], no histopathology-related AI application has been approved by a regulatory authority or for public reimbursement. This indicates that there are still many obstacles to be resolved before the introduction of AI applications in real-life histopathology practice (Figure 2).

Figure 2 Challenges for implementation in the development process of an artificial intelligence application. The process of development and implementation of an artificial intelligence (AI) application is composed of multiple steps from needs identification to use in real-life (left). In each step, various challenges keep AI applications from being implemented into clinical practice (right). AI: Artificial intelligence; IT: Information technology.

In this review, we aim to present and summarize the challenges in the process of development, validation, and regulation that should be overcome for the implementation of AI in real-life GI pathology practice. A complete and comprehensive review of the literature on GI pathology-related AI applications is beyond the scope of this paper and is well described elsewhere[12-14]. Here, we focus on how these recent advancements can be adopted in daily practice.

AI-APPLICATIONS IN GI PATHOLOGY

AI applications in tumor pathology, including GI cancers[4,5], have been developed for tumor diagnosis, subtyping, grading, staging, prognosis prediction, and identification of biomarkers and genetic alterations. Over the past decade, the implementation of DL technologies has dramatically improved the accuracy of digital image analysis[5]. DL is an ML method that is particularly effective for digital image analysis[6]. For images, DL is typically based on convolutional neural networks (CNNs), which consist of millions of artificial neurons assembled in several layers that translate the input data (the pixel value matrix of an image) into a more abstract representation (Figure 1). A dataset of digitized images annotated with specific labels (e.g., carcinoma or benign lesion) is fed into these layers of mathematical computation; ultimately, the CNN learns how to categorize images according to their respective labels. In doing so, it automatically identifies the most distinctive and common characteristics of each type of object. CNNs outperform hand-crafted or conventional ML techniques (using support vector machines or random forests) by a substantial margin in image classification[8,20]. In GI pathology, the prediction targets include tumor classification, the clinical outcome of the patient, and genetic alterations within the tumor (Tables 1 and 2).
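The layer-wise translation of a pixel matrix into a more abstract representation can be illustrated with a toy example. The following is a minimal sketch in plain Python, assuming a single hand-set 2x2 filter applied to a 5x5 grayscale patch; in a real CNN the filter weights are learned from labeled images and many filters and layers are stacked. All function names and values are illustrative and do not come from any of the cited studies.

```python
from typing import List

Matrix = List[List[float]]


def conv2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Valid (no padding) 2-D cross-correlation, the core operation of a CNN layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap: Matrix) -> Matrix:
    """Element-wise non-linearity: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool2x2(fmap: Matrix) -> Matrix:
    """Non-overlapping 2x2 max pooling; an odd trailing row or column is dropped."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]


# Hypothetical 5x5 patch: dark on the left, bright on the right (a vertical edge).
patch = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
# Hand-set filter that responds to a dark-to-bright transition (learned in practice).
edge_filter = [[-1.0, 1.0], [-1.0, 1.0]]

features = max_pool2x2(relu(conv2d(patch, edge_filter)))
print(features)  # a 2x2 feature map that is non-zero only where the edge lies
```

The 25 input pixels are reduced to four values that encode "an edge is present in the left half", which is the sense in which successive layers produce increasingly abstract descriptions of the image.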

Table 1 Artificial intelligence applications in gastric cancer pathology.
| Ref. | Task | No. of cases/data set | Machine learning method | Performance |
| --- | --- | --- | --- | --- |
| Bollschweiler et al[79] | Prognosis prediction | 135 cases | ANN | Accuracy (93%) |
| Duraipandian et al[80] | Tumor classification | 700 slides | GastricNet | Accuracy (100%) |
| Cosatto et al[65] | Tumor classification | > 12000 WSIs | MIL | AUC (0.96) |
| Sharma et al[21] | Tumor classification | 454 cases | CNN | Accuracy (69% for cancer classification), accuracy (81% for necrosis detection) |
| Jiang et al[81] | Prognosis prediction | 786 cases | SVM classifier | AUCs (up to 0.83) |
| Qu et al[82] | Tumor classification | 9720 images | DL | AUCs (up to 0.97) |
| Yoshida et al[23] | Tumor classification | 3062 gastric biopsy specimens | ML | Overall concordance rate (55.6%) |
| Kather et al[34] | Prediction of microsatellite instability | 1147 cases (gastric and colorectal cancer) | Deep residual learning | AUC (0.81 for gastric cancer; 0.84 for colorectal cancer) |
| Garcia et al[30] | Tumor classification | 3257 images | CNN | Accuracy (96.9%) |
| León et al[83] | Tumor classification | 40 images | CNN | Accuracy (up to 89.7%) |
| Fu et al[32] | Prediction of genomic alterations, gene expression profiling, and immune infiltration | > 1000 cases (gastric, colorectal, esophageal, and liver cancers) | Neural networks | AUC (0.9) for BRAF mutations prediction in thyroid cancers |
| Liang et al[84] | Tumor classification | 1900 images | DL | Accuracy (91.1%) |
| Sun et al[85] | Tumor classification | 500 images | DL | Accuracy (91.6%) |
| Tomita et al[24] | Tumor classification | 502 cases (esophageal adenocarcinoma and Barrett esophagus) | Attention-based deep learning | Accuracy (83%) |
| Wang et al[86] | Tumor classification | 608 images | Recalibrated multi-instance deep learning | Accuracy (86.5%) |
| Iizuka et al[22] | Tumor classification | 1746 biopsy WSIs | CNN, RNN | AUCs (up to 0.98), accuracy (95.6%) |
| Kather et al[33] | Prediction of genetic alterations and gene expression signatures | > 1000 cases (gastric, colorectal, and pancreatic cancer) | Neural networks | AUC (up to 0.8) |

Table 2 Artificial intelligence applications in colorectal cancer pathology.
| Ref. | Task | No. of cases/data set | Machine learning method | Performance |
| --- | --- | --- | --- | --- |
| Xu et al[38] | Tumor classification: 6 classes (NL/ADC/MC/SC/PC/CCTA) | 717 patches | AlexNet | Accuracy (97.5%) |
| Awan et al[87] | Tumor classification: Normal/Low-grade cancer/High-grade cancer | 454 cases | Neural networks | Accuracy (97%, for 2-class; 91%, for 3-class) |
| Haj-Hassan et al[37] | Tumor classification: 3 classes (NL/AD/ADC) | 30 multispectral image patches | CNN | Accuracy (99.2%) |
| Kainz et al[88] | Tumor classification: Benign/Malignant | 165 images | CNN (LeNet-5) | Accuracy (95%-98%) |
| Korbar et al[36] | Tumor classification: 6 classes (NL/HP/SSP/TSA/TA/TVA-VA) | 697 cases | ResNet | Accuracy (93.0%) |
| Yoshida et al[35] | Tumor classification | 1328 colorectal biopsy WSIs | ML | Accuracy (90.1%, adenoma) |
| Alom et al[89] | Tumor microenvironment analysis: Classification, Segmentation and Detection | 21135 patches | DCRN/R2U-Net | Accuracy (91.1%, classification) |
| Bychkov et al[42] | Prediction of colorectal cancer outcome (5-yr disease-specific survival) | 420 cases | Recurrent neural networks | HR of 2.3, AUC (0.69) |
| Weis et al[90] | Evaluation of tumor budding | 401 cases | CNN | Correlation R (0.86) |
| Ponzio et al[91] | Tumor classification: 3 classes (NL/AD/ADC) | 27 WSIs (13500 patches) | VGG16 | Accuracy (96%) |
| Kather et al[34] | Tumor classification: 2 classes (NL/Tumor) | 94 WSIs | ResNet18 | AUC (> 0.99) |
| Kather et al[34] | Prediction of microsatellite instability | 360 TCGA-DX (93408 patches), 378 TCGA-KR (60894 patches) | ResNet18 | AUC (0.77, TCGA-DX; 0.84, TCGA-KR) |
| Kather et al[26] | Tumor microenvironment analysis: classification of 9 cell types | 86 WSIs (100000) | VGG19 | Accuracy (94%-99%) |
| Kather et al[26] | Prognosis predictions | 1296 WSIs | VGG19 | Accuracy (94%-99%) |
| Kather et al[26] | Prognosis prediction | 934 cases | Deep learning (comparison of 5 networks) | HR for overall survival of 1.99 (training set) and 1.63 (test set) |
| Geessink et al[29] | Prognosis prediction, quantification of intratumoral stroma | 129 cases | Neural networks | HRs of 2.04 for disease-free survival |
| Sena et al[40] | Tumor classification: 4 classes (NL/HP/AD/ADC) | 393 WSIs (12565 patches) | CNN | Accuracy (80%) |
| Shapcott et al[92] | Tumor microenvironment analysis: detection and classification | 853 patches and 142 TCGA images | CNN with a grid-based attention network | Accuracy (84%, training set; 65%, test set) |
| Sirinukunwattana et al[31] | Prediction of consensus molecular subtypes of colorectal cancer | 1206 cases | Neural networks with domain-adversarial learning | AUC (0.84 and 0.95 in the two validation sets) |
| Swiderska-Chadaj et al[93] | Tumor microenvironment analysis: Detection of immune cell, CD3+, CD8+ | 28 WSIs | FCN/LSM/U-Net | Sensitivity (74.0%) |
| Yoon et al[39] | Tumor classification: 2 classes (NL/Tumor) | 57 WSIs (10280 patches) | VGG | Accuracy (93.5%) |
| Echle et al[46] | Prediction of microsatellite instability | 8836 cases | ShuffleNet Deep learning | AUC (0.92 in development cohort; 0.96 in validation cohort) |
| Iizuka et al[22] | Tumor classification: 3 classes (NL/AD/ADC) | 4036 WSIs | CNN/RNN | AUCs (0.96, ADC; 0.99, AD) |
| Skrede et al[28] | Prognosis predictions | 2022 cases | Neural networks with multiple instance learning | HR (3.04 after adjusting for established prognostic markers) |

In addition, a variety of ML methods have been developed. The strengths and weaknesses of typical ML methods are summarized in Table 3. All of the current ML methods have their advantages and disadvantages, and it is necessary to select an appropriate method according to the purpose of image analysis. DL-based methods are most commonly used in current image analysis of GI pathology; however, they have limitations of requiring substantial data sets and insufficient interpretability. In the future, the development of new ML methods that can compensate for the disadvantages of current ML methods will further accelerate the development of AI-models.

Table 3 Advantages and disadvantages of representative machine-learning methods in the development of artificial intelligence-models for gastrointestinal pathology.
| AI model | Advantages | Disadvantages |
| --- | --- | --- |
| Conventional ML (supervised) | User can reflect domain knowledge to features | Requires hand-crafted features; Accuracy depends heavily on the quality of feature extraction |
| Conventional ML (unsupervised) | Executable without labels | Results are often unstable; Interpretability of the results |
| Deep neural networks (CNN) | Automatic feature extraction; High accuracy | Requires a large dataset; Low explainability (Black box) |
| Multi-instance learning | Executable without detailed labels | Requires a large dataset; High computational cost |
| Semantic segmentation (FCN, U-Net) | Pixel-level detection gives the position, size, and shape of the target | High labeling cost |
| Recurrent neural networks | Learn sequential data | High computational cost |
| Generative adversarial networks | Learn to synthesize new realistic data | Complexity and instability in training |

Histopathological AI-applications in gastric cancer

Several attempts have been made to classify pathological images of gastric cancer using AI (Table 1). Before reviewing individual studies, it should be noted that performance comparisons should not rely on accuracy alone; attention should also be paid to the difficulty of the task within each research framework, i.e., (1) dataset size (results for small sample sizes are less reliable), (2) resolution of detection (tissue level or region level), (3) number of categories to be classified, (4) multi-site validation (whether the training and test datasets come from the same site), and (5) constraints on the target lesion (e.g., adenocarcinoma only, or any lesion except lymphoma). Sharma and colleagues documented the detection of gastric cancer in histopathological images using two DL-based methods: one analyzed the morphological features of the whole image, while the other investigated the focal features of the image independently. These models showed an average accuracy of up to 89.7%[21]. Iizuka et al[22] reported an AI algorithm, based on CNNs and recurrent neural networks, to classify gastric biopsy images into gastric adenocarcinoma, adenoma, and non-neoplastic tissue. Within three independent test datasets, the algorithm demonstrated an area under the curve (AUC) of 0.97 for the classification of gastric adenocarcinoma. Yoshida et al[23], using gastric biopsy specimens, compared the classification outcomes of experienced pathologists with those of "e-Pathologist", an ML-based program built by NEC Corporation. While the overall concordance rate between them was only 55.6% (1702/3062), the concordance rate was as high as 90.6% (1033/1140) for the biopsy specimens negative for a neoplastic lesion. Tomita et al[24] attempted to automate the identification of pre-neoplastic/neoplastic lesions in Barrett esophagus or gastric adenomas/adenocarcinomas.
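The two concordance rates quoted above for the study by Yoshida et al[23] can be checked with simple arithmetic. The sketch below recomputes them from the counts given in the text. The third figure is a derived illustration only: it assumes that the 1033 concordant negative specimens are included in the 1702 concordant specimens overall, which the text implies but does not state. The function name is hypothetical.

```python
def concordance_rate(n_concordant: int, n_total: int) -> float:
    """Percentage of specimens on which two readers (here, program and pathologist) agree."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_concordant / n_total


overall = concordance_rate(1702, 3062)    # all gastric biopsy specimens
negative = concordance_rate(1033, 1140)   # specimens negative for a neoplastic lesion

# Assumption (not stated in the source): the negative subset is nested in the total,
# so the remaining specimens can be obtained by subtraction.
remaining = concordance_rate(1702 - 1033, 3062 - 1140)

print(f"overall:   {overall:.1f}%")
print(f"negative:  {negative:.1f}%")
print(f"remaining: {remaining:.1f}% (derived under the nesting assumption)")
```

Under that assumption, the gap between the overall and the negative-only rates would be driven by a much lower agreement on the non-negative specimens, which illustrates why a single headline figure can hide large differences between diagnostic categories.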

The above tumor classification studies have shown that AI can be used for histopathological image analysis. However, other obstacles hinder its use in real-life practice. For example, the workload of pathologists could be reduced by identifying cases that need no further review by a pathologist; however, even in "negative" gastric biopsies, findings other than neoplastic lesions, such as Helicobacter pylori infection, need to be reviewed and recorded. Therefore, an AI application cannot be functional until it sufficiently represents the diagnostic procedures of real-life practice.

The prediction of prognosis from histopathological images of GI cancers is also an attractive area for AI application. Considering the many types of histopathological prognostic features of cancer, such as tumor differentiation or lymphovascular involvement, AI may be expected to unveil hidden morphological features that allow better prediction of clinical outcomes from the histopathological images alone[25-27]. After ingesting a sufficient number of histopathological images from patients with known outcomes, AI may comprehensively predict a patient's future outcome. Recently, a rapidly increasing number of studies on major GI cancers have demonstrated the feasibility of this concept[26,28,29]. Additionally, according to a recent study, tumor-infiltrating lymphocytes were associated with the prognosis of patients with gastric cancer[30]. A CNN model could detect tumor-infiltrating lymphocytes on histopathological specimens with an accuracy of 96.9%[30]. The development of DL models that incorporate clinical and multi-omics data is also a promising approach for predictive purposes[19]. Prognosis prediction by AI applications might be more accurate than that by conventional pathological methods; however, AI-based predictions alone are unlikely to be accepted in clinical practice because they lack interpretability. If doctors and patients cannot understand the reason for a prediction, they will not recognize a misprediction by AI. Patient care cannot be based on predictions that resemble “fortune-telling.” The biological and clinical reasons for a prediction by an AI application must be understood before its implementation in clinical practice.

Some researchers have also attempted to predict biomarker status from histopathological images alone using AI applications. Specimens of various GI cancers can be processed to identify molecular markers that may predict responses to targeted therapies. Research has shown that certain clinically relevant molecular alterations in GI cancers are associated with specific histopathological features detected on hematoxylin-eosin (HE) slides; there have been some successful attempts to adopt AI applications for HE sections as surrogate markers for these alterations[31-34].

Histopathological AI-applications in colorectal cancer

As in gastric cancer, various AI applications have recently been developed for colorectal cancer (Table 2). Regarding tumor classification, several AI algorithms have been trained to classify the dataset into two to six specific classes, such as normal, hyperplasia, adenoma, adenocarcinoma, and histological subtypes of polyps or adenocarcinomas[22,35-40]. Korbar et al[36] reported that an AI model, constructed using over 400 WSIs, could classify five types of colorectal polyps with an accuracy of 93%. Wei et al[41] demonstrated that a DL model trained using WSIs could reproducibly classify colorectal polyps, even in datasets from other hospitals. Its accuracy was comparable to that of a local pathologist. While most studies report promising performance, a precise comparison of performance among these AI applications is neither possible nor meaningful; each model is derived from different datasets with different annotations and focuses on different tasks. To accurately compare the performance of AI models, it is necessary to have them perform a common task using a standardized dataset with standardized annotations.

Further, a few studies have predicted prognosis using pathological images of colorectal cancer[26,34,42]. Bychkov et al[42] used 420 tissue microarray WSIs to predict the 5-year disease-specific survival of patients and obtained an AUC of 0.69. Kather et al[26] used more than 1000 histological images, collected from three institutions, to predict the prognosis of patients; they observed an accuracy of 99%. Another study, using a ResNet model for direct identification of microsatellite instability (MSI) on histological images, demonstrated AUCs of 0.77 and 0.84 for formalin-fixed paraffin-embedded (FFPE; TCGA-DX) and frozen (TCGA-KR) specimens, respectively, from The Cancer Genome Atlas (TCGA)[34] (Table 2). The identification of colorectal cancer with MSI is crucial because these tumors are reportedly highly responsive to immunomodulating therapies[43,44]; moreover, MSI could be a clue for the diagnosis of Lynch syndrome[45]. MSI is usually identified by polymerase chain reaction (PCR), but not all patients are screened for MSI in clinical practice. Echle et al[46] recently developed a DL model to detect colorectal cancer with MSI using more than 8800 images. The DL algorithm demonstrated an AUC of 0.96 in the multi-institutional validation cohort. Furthermore, the consensus molecular subtype of colorectal cancer could be predicted from the images of colorectal surgical specimens using a CNN-based model[31]. Prediction of molecular alterations by AI applications might seem attractive, because clinically relevant biomarkers cannot be identified by conventional assessment of HE-stained slides and conventional PCR assays are both expensive and time-consuming. However, AI can neither achieve complete concordance with the gold standard test nor replace it. Thus, users must consider how to employ AI for predicting biomarkers with an appropriate cost-effectiveness balance in real-life practice.
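Because many of the results above are reported as an AUC, it may help to recall what the number measures: the probability that a randomly chosen positive case (for example, an MSI tumor) receives a higher model score than a randomly chosen negative case. The following is a minimal sketch of that pairwise definition in plain Python; the labels and scores are invented for illustration and do not come from any cited study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U statistic).

    labels: 1 for positive cases, 0 for negative cases.
    scores: model outputs, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive case ranked above negative case
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical cohort: 3 positive and 4 negative cases.
demo_labels = [1, 1, 1, 0, 0, 0, 0]
demo_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
demo_auc = roc_auc(demo_labels, demo_scores)
print(f"AUC = {demo_auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The measure is independent of any decision threshold, so an AUC alone does not indicate how many cases would be misclassified at the operating point chosen for clinical use.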

A ROAD TO IMPLEMENTATION OF AI APPLICATIONS INTO REAL-LIFE PRACTICE

To achieve clinical implementation of AI, several steps should be considered (Figure 2). Colling et al[47] presented an expected roadmap for the routine use of AI in pathology practice. They highlighted the main aspects of designing and applying AI in daily practice and closely reviewed the steps concerning design creation, ethics, financing, development, validation and regulation, implementation, and effect on the workforce. For pathological image analysis, various problems exist in the execution of these steps, which could prevent AI from being implemented in clinical practice for GI cancers.

Identification of the true needs in daily practice

AI applications can either conduct routine tasks usually performed by pathologists or offer novel insights into diseases that human pathologists cannot provide[12]. Such applications are needed to fill gaps and address unmet needs without disrupting the daily workflow in the pathology department. The needs include mitosis detection, tumor-percentage calculation, lymph node metastasis detection, and other activities that are considered monotonous, repetitive, or prone to high interobserver variability.

The initial step in the development of an AI application is to recognize the true clinical need and define a possible solution. Novel AI applications can be developed by various stakeholders, including pathologists, physicians, computer scientists, engineers, IT companies, and drug companies. However, the viewpoints of professionals in academia and industry differ. For example, individuals in academia and in business pursue different goals, such as grant funding and academic publications on the one hand and profitable commercial products on the other.

Even if there is a problem that pathologists are eager to solve, the market size of the problem could be small. If the cost of developing an AI application to solve the problem cannot be recovered by the subsequent profit from the sale of the application, the company may not develop it. There is a wide range of classification tasks in diagnostic pathology, and it is difficult to secure an appropriate market for an AI application specializing only in a single task. For example, an AI algorithm can detect lymph node metastases in breast cancer as reliably as human pathologists[48,49]. Still, this tool has not been widely used or approved by the regulatory authorities. Although there could be many reasons, one is the imbalance between the overall cost of its implementation and the benefit of detecting only breast cancer lymph node metastases in real-life pathology practice.

Another significant concern is obtaining consent for the use of patient data in AI-model development[50]. Although the consent for research use could be obtained in most studies, patients might not consent to commercial use of their data required for product development, which could be an obstacle when developing products for clinical implementation. Therefore, consent should be obtained at the beginning of the research, conveying the possibility of its commercial use for product development; a framework for global data sharing should be developed.

For the development of AI algorithms, at least three parties need to collaborate, which include pathologists who know the true needs, academic professionals who can develop technology, and companies that will promote AI applications as products. In addition, to obtain a sufficiently sized market, it may be vital to develop global networks and online services using the cloud.

Development

After a concept of AI has been conceived and collaboratively established, the development of AI is carried out through the following steps: defining the output, designing the algorithm, collection of a pilot or larger follow-up sample, annotation and processing of data, and performing statistical analysis of the data.

High-quality dataset curation is one of the major hurdles in the development of AI applications. Generally, CNNs require datasets of hundreds or thousands of pathological images to achieve significant performance and sufficient generalizability[51]. For rare tumors, only a very limited number of images can be obtained; thus, efficient data augmentation techniques and learning methods are required to resolve this issue. Conversely, with transfer learning, small-scale datasets consisting of < 100 digital slides may suffice[52].
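One simple form of data augmentation relies on the fact that a tissue section has no canonical orientation, so rotating or mirroring a patch does not change its diagnostic label. The following is a minimal sketch, assuming patches are stored as small nested lists of pixel values; it generates the eight rotated and flipped variants of a patch. It is an illustration of the general idea rather than the augmentation scheme of any cited study, and the function names are hypothetical.

```python
from typing import List

Patch = List[List[int]]


def rotate90(patch: Patch) -> Patch:
    """Rotate a patch 90 degrees clockwise."""
    return [list(row) for row in zip(*patch[::-1])]


def flip_horizontal(patch: Patch) -> Patch:
    """Mirror a patch left to right."""
    return [row[::-1] for row in patch]


def dihedral_augment(patch: Patch) -> List[Patch]:
    """Return the 8 variants of a patch: 4 rotations, each with and without a flip."""
    variants: List[Patch] = []
    current = [list(row) for row in patch]
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate90(current)
    return variants


# Hypothetical 2x2 "patch" whose pixels are all different, so every variant is distinct.
toy_patch = [[1, 2], [3, 4]]
augmented = dihedral_augment(toy_patch)
for variant in augmented:
    print(variant)
```

Each annotated patch thus yields eight training examples at no additional labeling cost. Such transformations only re-present existing tissue; they do not add new patients or new morphological variety, so they cannot fully compensate for a small cohort.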

In addition, publicly available datasets should be developed for global data sharing. However, few such datasets are available in pathology, partly due to confidentiality, copyright, and financial problems[53]. Even under such circumstances, TCGA provides many WSIs and associated molecular data[54]. However, even TCGA data does not include sufficient numbers of cases for training AI applications for clinical implementation. Another potential source of datasets could be the public challenges provided for developing DL algorithms[55].

The development of AI applications with sufficient performance requires training on huge datasets that capture the variability in scanning[56] and staining protocols[56,57]. The major challenges for implementation in practice are laboratory infrastructure and the reproducibility and robustness of the AI model. Recently, automated methods for reducing blur in images have been developed. Automated algorithms (for example, HistoQC[58] and DeepFocus[59]) can reportedly standardize the quality of WSIs; these AI applications automatically detect regions of optimum quality and eliminate out-of-focus or artifact-related regions. Standardization of the color displayed by histopathological slides is important for the accuracy of AI; color variations are often produced by differences in batches or manufacturers of staining reagents, variations in the thickness of tissue sections, differences in staining protocols, and disparities in scanning characteristics. These variations lead to inadequate classification by AI applications[56,60]. AI algorithms have been developed to standardize the data[61], including staining[62] and color characteristics[63].
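The idea behind color standardization can be shown with a deliberately simple scheme: shift and scale each color channel of a source image so that its mean and standard deviation match those of a reference slide. The sketch below is a simplified illustration only. It works directly on RGB values to avoid external dependencies, whereas statistics-matching methods of this kind usually operate in a perceptual color space, and it is not the algorithm of any of the cited references. All names and pixel values are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Tuple

Pixel = Tuple[float, float, float]
Stats = List[Tuple[float, float]]  # (mean, standard deviation) per channel


def channel_stats(pixels: List[Pixel]) -> Stats:
    """Per-channel mean and population standard deviation of a list of RGB pixels."""
    return [(mean(channel), pstdev(channel)) for channel in zip(*pixels)]


def match_color_stats(source: List[Pixel], target_stats: Stats) -> List[Pixel]:
    """Linearly rescale each channel of `source` to the target mean and spread."""
    source_stats = channel_stats(source)
    normalized: List[Pixel] = []
    for pixel in source:
        new_pixel = []
        for value, (s_mean, s_std), (t_mean, t_std) in zip(pixel, source_stats, target_stats):
            scale = t_std / s_std if s_std > 0 else 1.0  # guard against flat channels
            rescaled = (value - s_mean) * scale + t_mean
            new_pixel.append(min(255.0, max(0.0, rescaled)))  # keep within 8-bit range
        normalized.append((new_pixel[0], new_pixel[1], new_pixel[2]))
    return normalized


# Hypothetical pixels from a slide stained at one laboratory ...
source_pixels = [(100.0, 50.0, 150.0), (120.0, 70.0, 170.0), (140.0, 90.0, 190.0)]
# ... and hypothetical per-channel statistics of a reference slide from another.
reference_stats = [(200.0, 10.0), (100.0, 10.0), (180.0, 10.0)]

standardized = match_color_stats(source_pixels, reference_stats)
print(channel_stats(standardized))
```

After the transformation, images from both sources share the same global color statistics, which removes one pre-analytical source of variation before the images reach a classifier.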

After dataset curation, the dataset must be annotated. Histopathological image annotation is not a simple task. The extent of annotation detail depends on the application of AI and can vary from classification at the slide level to labeling at the pixel level. The annotation of many images by human experts is time-consuming and tedious. In addition, variability in annotation performance, especially when the task is difficult, may affect the accuracy of the trained models. Moreover, for manufacturers, this task is often expensive. Among GI pathologies, many lesions, such as intramucosal gastric carcinoma, do not have high interobserver reproducibility. When developing an AI application to assist pathologists in making a diagnosis, if the target disease shows significant interobserver variability, the correctness of the dataset annotation cannot be guaranteed, and the trained algorithm may not be able to reproduce its performance when used in other facilities, which may hinder its clinical implementation.
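Interobserver variability of the kind described above is commonly quantified with Cohen's kappa, which corrects the raw agreement between two raters for the agreement expected by chance. The following is a minimal sketch in plain Python; the two lists of diagnoses are invented for illustration, and the choice of kappa as the measure is an assumption of this example rather than a recommendation taken from the source.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters labeling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical diagnoses of 10 biopsies by two pathologists.
pathologist_1 = ["carcinoma"] * 4 + ["non-neoplastic"] * 6
pathologist_2 = ["carcinoma"] * 3 + ["non-neoplastic"] * 6 + ["carcinoma"]

kappa = cohens_kappa(pathologist_1, pathologist_2)
print(f"raw agreement = 0.80, kappa = {kappa:.2f}")
```

In this toy example the raters agree on 8 of 10 cases, yet kappa is noticeably lower than 0.8 because part of that agreement would occur by chance. Reporting such a statistic for the annotators of a training dataset gives users a sense of how much label noise the model may have inherited.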

The problem of annotation in AI is an important research area. The majority of AI models are trained using images of small tissue patches collected from WSIs. Since patches cropped from positive tissue may not contain tumor unless the tissue is filled with tumor, it is challenging to construct a high-accuracy model, particularly when pixel-level labeling is unavailable. To conduct patch-based training without detailed annotation, multi-instance learning (MIL) algorithms can be used[64,65]. Cosatto et al[65] employed MIL for gastric cancer detection; they used over 12000 cases, two-thirds for training and one-third for testing, and achieved an AUC of 0.96. MIL is especially effective when a large dataset is available and detailed annotations are impossible to obtain[51].
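The core assumption of MIL is that a slide (a "bag" of patches) is positive if at least one of its patches is positive, and negative only if all patches are negative. The sketch below illustrates the resulting aggregation step with max pooling over patch scores. It is a generic illustration and not the method of Cosatto et al[65]; the patch scores, slide identifiers, and threshold are hypothetical.

```python
from typing import Dict, List


def slide_score(patch_scores: List[float]) -> float:
    """Max-pooling MIL aggregation: the most suspicious patch determines the slide score."""
    if not patch_scores:
        raise ValueError("a slide must contain at least one patch")
    return max(patch_scores)


def classify_slides(bags: Dict[str, List[float]], threshold: float = 0.5) -> Dict[str, bool]:
    """Label each slide positive when its aggregated score reaches the threshold."""
    return {slide_id: slide_score(scores) >= threshold for slide_id, scores in bags.items()}


# Hypothetical per-patch tumor probabilities produced by a patch-level classifier.
demo_bags = {
    "slide_A": [0.05, 0.10, 0.92],  # one small focus of tumor among benign patches
    "slide_B": [0.02, 0.20, 0.30],  # no patch looks suspicious
}

demo_predictions = classify_slides(demo_bags)
print(demo_predictions)
```

During training, only the slide-level label is compared with the aggregated score, so the error signal flows mainly through the highest-scoring patches. This is why slide-level diagnoses, which already exist in pathology reports, can replace pixel-level annotation, at the price of needing many slides.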

After the preparation of the annotated dataset, the model development process usually comprises the following steps: preparation of the datasets for training, validation, and testing, followed by selection of the ML framework, ML technique, and learning method. Once the learning process is completed, the output of the model is evaluated through performance metrics, and the hyperparameters are fine-tuned to improve performance. Considering the exponential increase in AI research for image analysis, this step does not seem to be a major obstacle to the implementation of AI in clinical practice.
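When the datasets for training, validation, and testing are prepared, one practical point is that several slides often come from the same patient. The following sketch splits at the patient level so that no patient contributes slides to more than one subset. This grouping rule is an assumption of the example (a common precaution against overly optimistic test results), not a procedure described in the source, and the identifiers, fractions, and function name are hypothetical.

```python
import random
from typing import Dict, List, Tuple


def split_by_patient(
    slide_to_patient: Dict[str, str],
    fractions: Tuple[float, float, float] = (0.7, 0.15, 0.15),
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Assign slides to train/validation/test so that each patient appears in one subset only."""
    patients = sorted(set(slide_to_patient.values()))
    random.Random(seed).shuffle(patients)  # reproducible shuffle of patients, not slides
    n_train = int(round(fractions[0] * len(patients)))
    n_val = int(round(fractions[1] * len(patients)))
    groups = {
        "train": set(patients[:n_train]),
        "validation": set(patients[n_train:n_train + n_val]),
        "test": set(patients[n_train + n_val:]),
    }
    return {
        name: sorted(s for s, p in slide_to_patient.items() if p in members)
        for name, members in groups.items()
    }


# Hypothetical cohort: 20 patients with 2 slides each.
demo_slides = {f"slide_{p}_{k}": f"patient_{p}" for p in range(20) for k in range(2)}
demo_splits = split_by_patient(demo_slides, seed=42)
for name, slides in demo_splits.items():
    print(name, len(slides))
```

The validation subset is used for tuning hyperparameters, and the test subset is held back until the end. Even a clean internal split of this kind does not replace validation on data from other institutions.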

Validation and regulation

As AI-based technologies proliferate, an evidence-based approach is required for their validation. Colling et al[47] summarized the guidance in the current in vitro device regulation and presented their recommendations for the main components of validation. In laboratory medicine, analytical validation should be considered in addition to clinical evaluation[66]. Establishing steps and criteria for validating new tests against existing gold standards is essential. For image analysis validation, the technique is often compared with the “ground truth” (for example, comparing an AI technology that analyzes HER2 expression within the tumor with a detailed tumor assessment performed manually). It would be appropriate to compare the digital pathology technique with the performance of human pathologists. However, given the inter- and intra-observer variability in the visual assessments of human pathologists, it is difficult to identify the ground truth; validation therefore requires careful study design and acceptance of the limitations of the present gold standard. Currently, most AI applications seem to have difficulty in establishing an absolute ground truth. Therefore, the robustness and reproducibility of AI applications should be repeatedly validated in large and variable patient cohorts.
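
As an illustration of comparing model output with a reference standard, the area under the ROC curve (AUC) can be computed directly from its rank interpretation: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below assumes binary reference labels; the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Each correctly ordered pair counts 1, each tied pair counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical reference labels (1 = tumor) and model scores for six slides.
reference = [1, 1, 1, 0, 0, 0]
model_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(reference, model_scores)
print(f"AUC = {auc:.3f}")
```

Because the reference labels come from pathologists, an AUC computed this way inherits whatever uncertainty the labels contain; it measures agreement with the chosen gold standard, not with an absolute truth.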

The relative lack of validation cohorts is an urgent issue in the development of AI-based applications. Histopathological slides with detailed clinical data linked to them often cannot be shared widely, for reasons such as privacy protection. Annotations by pathologists, which are usually considered the “ground truth”, remain controversial. Inter-observer variability and subjectivity in a pathologist's assessments mean that a certain amount of uncertainty is inherent in the ground truth. However, where the pathologist's assessment is the only available ground truth, it is important to enhance accuracy through validation as the next best measure. Efficient validation and testing require multicenter assessments involving multiple pathologists and datasets. If an AI application is intended for real-life practice, it should be robust against pre-analytical variations within the target images, such as differences in staining conditions and WSI scanners, and its performance should be reproducible. In this respect, a significant proportion of currently published AI research in GI cancers has not been externally validated.
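
One simple way to check robustness across centers is to report performance separately for each site or scanner rather than pooled. The sketch below stratifies sensitivity and specificity by site; the record layout, the site names, and the results are hypothetical and serve only to show the bookkeeping.

```python
from collections import defaultdict


def per_site_metrics(records):
    """Sensitivity and specificity per site from (site, truth, prediction) records."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for site, truth, prediction in records:
        if truth:
            key = "tp" if prediction else "fn"
        else:
            key = "fp" if prediction else "tn"
        counts[site][key] += 1
    metrics = {}
    for site, c in counts.items():
        n_pos = c["tp"] + c["fn"]
        n_neg = c["tn"] + c["fp"]
        metrics[site] = {
            # None marks a metric that is undefined for this site.
            "sensitivity": c["tp"] / n_pos if n_pos else None,
            "specificity": c["tn"] / n_neg if n_neg else None,
        }
    return metrics


# Hypothetical results: the development site and one external site.
records = [
    ("internal", True, True), ("internal", True, True),
    ("internal", False, False), ("internal", False, False),
    ("external", True, True), ("external", True, False),
    ("external", False, False), ("external", False, True),
]
site_metrics = per_site_metrics(records)
print(site_metrics)
```

A marked drop at the external site, as in this toy example, would point to sensitivity to staining or scanner differences that a pooled metric could hide.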

Regulatory challenges

Appropriate regulations are required for the safe and effective use of AI in pathological practice. Unlike with other laboratory tests, it is difficult to understand how AI applications make their predictions; therefore, they are often viewed as black boxes. Although various visualization techniques, including gradient saliency maps[67] and filter visualization methods, have been developed, users may not be able to fully understand all the parameter changes that cause erroneous performance or misprediction. Regulatory approval should be structured to minimize potential harm, define the risk-benefit balance, develop appropriate validation standards, and promote innovation[68].

Regulatory authorities, such as the FDA, the Centers for Medicare and Medicaid Services (CMS), and the European Union Conformité Européenne (EUCE), are not yet fully prepared for the implementation of AI applications in clinical medicine. As a result, AI-based devices are being regulated under earlier and potentially obsolete guidelines for testing medical devices.

In the United States, the FDA is devising novel regulations for AI-based devices to make them safer and more effective[69]. CMS controls laboratory testing through the Clinical Laboratory Improvement Amendments (CLIA). CLIA stipulates that appropriate validation must be performed for all laboratory tests using human tissue before clinical implementation, regardless of their FDA approval. Currently, CLIA has no specific regulations for validating AI applications. In the EU, the Medical Device Regulation will replace the medical device directive in May 2021, and the in vitro diagnostic medical device directive will be replaced by the in vitro diagnostic regulation in May 2022[70]. Successful clinical implementation of AI-based applications will be assisted by the global market, and those implementing the applications clinically will need to pay particular attention to regulatory trends in their own country as well as in the US and EU. For AI applications to be approved by the FDA and EUCE, they should be developed in line with the latest FDA and EUCE regulations.

Implementation

Before an AI application is implemented in real-life pathology practice, several obstacles must be addressed. Established business use cases and a commitment from pathologists to use the AI system should be secured before substantial time, energy, and funds are invested in AI applications and the required IT infrastructure.

The workflow changes required to shift the pathology department from glass slides to WSIs must be addressed. The department would require new digital pathology devices, a dedicated data management system, data storage facilities, and additional personnel to handle these changes. At the same time, an institutional IT infrastructure is required to enable users to work with both on-site and cloud-based computing systems. In real-world settings, therefore, the substantial investment that digital pathology systems require may hamper the implementation of these technologies[71]. Notably, augmented microscopy connected directly to a cloud network service might remove the need to install whole slide scanners. Chen and colleagues reported that an augmented reality microscope, which overlays AI-based information onto the sample view in real time, may enable seamless integration of AI into the routine workflow[72]. According to Hegde et al[73], the cloud-based AI application SMILY (Similar image search for histopathology), developed by Google, allows users to search for features morphologically similar to those in a target image, irrespective of annotation status.

In addition, one must consider the relative inexperience of pathologists with AI-based technologies and acknowledge the range of issues the department would encounter before implementing AI. Pathologists must also buy in to making significant changes to a conventional, century-old workflow. Because progress does not happen immediately, pathologists' management concerns should be dealt with separately from the technological hurdles. First, pathologists must commit to installing both digital pathology systems and AI applications in the pathology department, and they must understand the long-term risk-benefit balance of AI implementation. The present DL-based AI applications lack interpretability, which may contribute to the reluctance of patients and clinicians. Developing AI solutions that end-users can interpret, with detailed descriptions of how predictions are made, could be useful[74]. Various solutions to the lack of interpretability of DL models have been reported, such as generating attention heat maps[75], constructing interpretable models[76], and creating external interpretive models[77]. However, this black box problem is not yet fully resolved.

On the downside, dependence on AI assistance for diagnosis can reduce opportunities for trainees to learn diagnostic skills. Although AI can be used as an auxiliary method to improve the quality and precision of clinical diagnoses, resident pathologists should be trained and encouraged to understand the utility, limitations, and pitfalls of AI applications[78]. Just as molecular pathologists have become necessary since the advent of genomic medicine, “computational pathologists”[47] will become necessary in the near future.

As with other clinical tests, ongoing post-marketing quality assurance is essential for the safe and effective use of AI in clinical practice. In addition to the laboratory testing processes, laboratory staff should understand the quality management system. As with conventional laboratory tests, an external quality assurance scheme for AI applications in pathology should be urgently established.

The use of AI applications in diagnostic practice raises complex new questions about the legal ramifications of a pathologist signing a report prepared using AI. To incorporate AI output into a pathological report, a pathologist should be confident in the performance of the algorithm; further, any algorithms used should be properly validated and regulated. Although AI applications may not replace pathologists in view of this legal issue, they can be employed to support pathologists in their clinical work. In particular, AI researchers are attempting to accompany predictions with confidence estimates and to localize pathology-related features. This could help address concerns about interpretability and build confidence.
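
How a confidence estimate might be used in a supportive role can be sketched as a simple routing rule: predictions that fall below a chosen confidence level are flagged for unassisted review. This is an illustrative assumption, not a workflow described in the cited literature; the threshold, case identifiers, and probabilities are hypothetical, and the rule presumes that the model's probabilities are reasonably well calibrated.

```python
def route_by_confidence(cases, min_confidence=0.9):
    """Separate cases with confident AI output from those needing unassisted review."""
    confident, flagged = [], []
    for case_id, tumor_probability in cases:
        # For a binary prediction, confidence is the probability of the predicted class.
        confidence = max(tumor_probability, 1.0 - tumor_probability)
        if confidence >= min_confidence:
            confident.append(case_id)
        else:
            flagged.append(case_id)
    return confident, flagged


# Hypothetical model outputs (probability of tumor) for four cases.
cases = [("case_1", 0.98), ("case_2", 0.55), ("case_3", 0.03), ("case_4", 0.80)]
confident, flagged = route_by_confidence(cases)
print("confident:", confident)
print("flagged:", flagged)
```

In either queue the pathologist remains the signatory of the report; the rule only indicates where the AI output is least reliable.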

CONCLUSION

The immense potential of AI in pathological practice can be harnessed by improving workflows, eliminating simple mistakes, increasing diagnostic reproducibility, and revealing predictions that are impossible with conventional visual assessment by human pathologists. Clinically implemented AI applications are expected to be user-friendly, explainable, robust, manageable, and cost-effective. Considering the current limited clinical awareness and uncertainty about how AI tools can be introduced into real-life practice, caution should be exercised in their deployment. Eventually, AI applications may be implemented and used appropriately, provided they are supported by human pathologists, standardized usage recommendations, and harmonization of AI applications with present information systems.

AI can play a pivotal role in the practice of pathologists and the development of precision medicine for GI cancers. However, there are various barriers to its effective implementation. To overcome these barriers and implement AI at the practice level, it is necessary to work with a range of stakeholders, including pathologists, clinicians, developers, regulators, and device vendors, to establish a strong network that can identify true needs, expand the market, and ensure that the applications are used safely and efficiently.

Footnotes

Manuscript source: Invited manuscript

Corresponding Author's Membership in Professional Societies: The Japanese Society of Pathologists; American Society of Clinical Oncology; and Japanese Association for Medical Artificial Intelligence.

Specialty type: Gastroenterology and hepatology

Country/Territory of origin: Japan

Peer-review report’s scientific quality classification

Grade A (Excellent): 0

Grade B (Very good): B

Grade C (Good): 0

Grade D (Fair): 0

Grade E (Poor): 0

P-Reviewer: Song B S-Editor: Gao CC L-Editor: A P-Editor: Ma YJ

References
1.  Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25:44-56.  [PubMed]  [DOI]
2.  Andras I, Mazzone E, van Leeuwen FWB, De Naeyer G, van Oosterom MN, Beato S, Buckle T, O'Sullivan S, van Leeuwen PJ, Beulens A, Crisan N, D'Hondt F, Schatteman P, van Der Poel H, Dell'Oglio P, Mottrie A. Artificial intelligence and robotics: a combination that is changing the operating room. World J Urol. 2020;38:2359-2366.  [PubMed]  [DOI]
3.  Mukhopadhyay S, Feldman MD, Abels E, Ashfaq R, Beltaifa S, Cacciabeve NG, Cathro HP, Cheng L, Cooper K, Dickey GE, Gill RM, Heaton RP Jr, Kerstens R, Lindberg GM, Malhotra RK, Mandell JW, Manlucu ED, Mills AM, Mills SE, Moskaluk CA, Nelis M, Patil DT, Przybycin CG, Reynolds JP, Rubin BP, Saboorian MH, Salicru M, Samols MA, Sturgis CD, Turner KO, Wick MR, Yoon JY, Zhao P, Taylor CR. Whole Slide Imaging Versus Microscopy for Primary Diagnosis in Surgical Pathology: A Multicenter Blinded Randomized Noninferiority Study of 1992 Cases (Pivotal Study). Am J Surg Pathol. 2018;42:39-52.  [PubMed]  [DOI]
4.  Niazi MKK, Parwani AV, Gurcan MN. Digital pathology and artificial intelligence. Lancet Oncol. 2019;20:e253-e261.  [PubMed]  [DOI]
5.  Bera K, Schalper KA, Rimm DL, Velcheti V, Madabhushi A. Artificial intelligence in digital pathology - new tools for diagnosis and precision oncology. Nat Rev Clin Oncol. 2019;16:703-715.  [PubMed]  [DOI]
6.  Rashidi HH, Tran NK, Betts EV, Howell LP, Green R. Artificial Intelligence and Machine Learning in Pathology: The Present Landscape of Supervised Methods. Acad Pathol. 2019;6:2374289519873088.  [PubMed]  [DOI]
7.  LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521:436-444.  [PubMed]  [DOI]
8.  Hinton GE, Salakhutdinov RR. Reducing the dimensionality of data with neural networks. Science. 2006;313:504-507.  [PubMed]  [DOI]
9.  Jain RK, Mehta R, Dimitrov R, Larsson LG, Musto PM, Hodges KB, Ulbright TM, Hattab EM, Agaram N, Idrees MT, Badve S. Atypical ductal hyperplasia: interobserver and intraobserver variability. Mod Pathol. 2011;24:917-923.  [PubMed]  [DOI]
10.  Elmore JG, Longton GM, Carney PA, Geller BM, Onega T, Tosteson AN, Nelson HD, Pepe MS, Allison KH, Schnitt SJ, O'Malley FP, Weaver DL. Diagnostic concordance among pathologists interpreting breast biopsy specimens. JAMA. 2015;313:1122-1132.  [PubMed]  [DOI]
11.  Jiang Y, Yang M, Wang S, Li X, Sun Y. Emerging role of deep learning-based artificial intelligence in tumor pathology. Cancer Commun (Lond). 2020;40:154-166.  [PubMed]  [DOI]
12.  Calderaro J, Kather JN. Artificial intelligence-based pathology for gastrointestinal and hepatobiliary cancers. Gut. 2020;.  [PubMed]  [DOI]
13.  Niu PH, Zhao LL, Wu HL, Zhao DB, Chen YT. Artificial intelligence in gastric cancer: Application and future perspectives. World J Gastroenterol. 2020;26:5408-5419.  [PubMed]  [DOI]
14.  Thakur N, Yoon H, Chong Y. Current Trends of Artificial Intelligence for Colorectal Cancer Pathology Image Analysis: A Systematic Review. Cancers (Basel). 2020;12.  [PubMed]  [DOI]
15.  Khan A, Nawaz U, Ulhaq A, Robinson RW. Real-time plant health assessment via implementing cloud-based scalable transfer learning on AWS DeepLens. PLoS One. 2020;15:e0243243.  [PubMed]  [DOI]
16.  PathAI. PathAI Present Machine Learning Models that Predict the Homologous Recombination Deficiency Status of Breast Cancer Biopsies at the 2020 SABCS. [cited 7 January 2021]. In: PathAI [Internet]. Available from: https://www.pathai.com/news/pathai-sabcs2020.  [PubMed]  [DOI]
17.  Pantanowitz L, Sinard JH, Henricks WH, Fatheree LA, Carter AB, Contis L, Beckwith BA, Evans AJ, Lal A, Parwani AV;  College of American Pathologists Pathology and Laboratory Quality Center. Validating whole slide imaging for diagnostic purposes in pathology: guideline from the College of American Pathologists Pathology and Laboratory Quality Center. Arch Pathol Lab Med. 2013;137:1710-1722.  [PubMed]  [DOI]
18.  Cheng CL, Azhar R, Sng SH, Chua YQ, Hwang JS, Chin JP, Seah WK, Loke JC, Ang RH, Tan PH. Enabling digital pathology in the diagnostic setting: navigating through the implementation journey in an academic medical centre. J Clin Pathol. 2016;69:784-792.  [PubMed]  [DOI]
19.  Hamamoto R, Suvarna K, Yamada M, Kobayashi K, Shinkai N, Miyake M, Takahashi M, Jinnai S, Shimoyama R, Sakai A, Takasawa K, Bolatkan A, Shozu K, Dozen A, Machino H, Takahashi S, Asada K, Komatsu M, Sese J, Kaneko S. Application of Artificial Intelligence Technology in Oncology: Towards the Establishment of Precision Medicine. Cancers (Basel). 2020;12.  [PubMed]  [DOI]
20.  de Groof AJ, Struyvenberg MR, van der Putten J, van der Sommen F, Fockens KN, Curvers WL, Zinger S, Pouw RE, Coron E, Baldaque-Silva F, Pech O, Weusten B, Meining A, Neuhaus H, Bisschops R, Dent J, Schoon EJ, de With PH, Bergman JJ. Deep-Learning System Detects Neoplasia in Patients With Barrett's Esophagus With Higher Accuracy Than Endoscopists in a Multistep Training and Validation Study With Benchmarking. Gastroenterology. 2020;158:915-929.e4.  [PubMed]  [DOI]
21.  Sharma H, Zerbe N, Klempert I, Hellwich O, Hufnagl P. Deep convolutional neural networks for automatic classification of gastric carcinoma using whole slide images in digital histopathology. Comput Med Imaging Graph. 2017;61:2-13.  [PubMed]  [DOI]
22.  Iizuka O, Kanavati F, Kato K, Rambeau M, Arihiro K, Tsuneki M. Deep Learning Models for Histopathological Classification of Gastric and Colonic Epithelial Tumours. Sci Rep. 2020;10:1504.  [PubMed]  [DOI]
23.  Yoshida H, Shimazu T, Kiyuna T, Marugame A, Yamashita Y, Cosatto E, Taniguchi H, Sekine S, Ochiai A. Automated histological classification of whole-slide images of gastric biopsy specimens. Gastric Cancer. 2018;21:249-257.  [PubMed]  [DOI]
24.  Tomita N, Abdollahi B, Wei J, Ren B, Suriawinata A, Hassanpour S. Attention-Based Deep Neural Networks for Detection of Cancerous and Precancerous Esophagus Tissue on Histopathological Slides. JAMA Netw Open. 2019;2:e1914645.  [PubMed]  [DOI]
25.  Courtiol P, Maussion C, Moarii M, Pronier E, Pilcer S, Sefta M, Manceron P, Toldo S, Zaslavskiy M, Le Stang N, Girard N, Elemento O, Nicholson AG, Blay JY, Galateau-Sallé F, Wainrib G, Clozel T. Deep learning-based classification of mesothelioma improves prediction of patient outcome. Nat Med. 2019;25:1519-1525.  [PubMed]  [DOI]
26.  Kather JN, Krisam J, Charoentong P, Luedde T, Herpel E, Weis CA, Gaiser T, Marx A, Valous NA, Ferber D, Jansen L, Reyes-Aldasoro CC, Zörnig I, Jäger D, Brenner H, Chang-Claude J, Hoffmeister M, Halama N. Predicting survival from colorectal cancer histology slides using deep learning: A retrospective multicenter study. PLoS Med. 2019;16:e1002730.  [PubMed]  [DOI]
27.  Luo X, Yin S, Yang L, Fujimoto J, Yang Y, Moran C, Kalhor N, Weissferdt A, Xie Y, Gazdar A, Minna J, Wistuba II, Mao Y, Xiao G. Development and Validation of a Pathology Image Analysis-based Predictive Model for Lung Adenocarcinoma Prognosis - A Multi-cohort Study. Sci Rep. 2019;9:6886.  [PubMed]  [DOI]
28.  Skrede OJ, De Raedt S, Kleppe A, Hveem TS, Liestøl K, Maddison J, Askautrud HA, Pradhan M, Nesheim JA, Albregtsen F, Farstad IN, Domingo E, Church DN, Nesbakken A, Shepherd NA, Tomlinson I, Kerr R, Novelli M, Kerr DJ, Danielsen HE. Deep learning for prediction of colorectal cancer outcome: a discovery and validation study. Lancet. 2020;395:350-360.  [PubMed]  [DOI]
29.  Geessink OGF, Baidoshvili A, Klaase JM, Ehteshami Bejnordi B, Litjens GJS, van Pelt GW, Mesker WE, Nagtegaal ID, Ciompi F, van der Laak JAWM. Computer aided quantification of intratumoral stroma yields an independent prognosticator in rectal cancer. Cell Oncol (Dordr). 2019;42:331-341.  [PubMed]  [DOI]
30.  García E, Hermoza R, Beltran-Castanon C, Cano L, Castillo M, Castanneda C. Automatic Lymphocyte Detection on Gastric Cancer IHC Images Using Deep Learning. IEEE. 2017;200-204.  [PubMed]  [DOI]
31.  Sirinukunwattana K, Domingo E, Richman SD, Redmond KL, Blake A, Verrill C, Leedham SJ, Chatzipli A, Hardy C, Whalley CM, Wu CH, Beggs AD, McDermott U, Dunne PD, Meade A, Walker SM, Murray GI, Samuel L, Seymour M, Tomlinson I, Quirke P, Maughan T, Rittscher J, Koelzer VH;  S:CORT consortium. Image-based consensus molecular subtype (imCMS) classification of colorectal cancer using deep learning. Gut. 2021;70:544-554.  [PubMed]  [DOI]
32.  Fu Y, Jung AW, Torne RV, Gonzalez S, Vöhringer H, Shmatko A, Yates LR, Jimenez-Linan M, Moore L, Gerstung M. Pan-cancer computational histopathology reveals mutations, tumor composition and prognosis. Nat Cancer. 2020;1:800-810.  [PubMed]  [DOI]
33.  Kather JN, Heij LR, Grabsch HI, Loeffler C, Echle A, Muti HS, Krause J, Niehues JM, Sommer KAJ, Bankhead P, Kooreman LFS, Schulte JJ, Cipriani NA, Buelow RD, Boor P, Ortiz-Brüchle N, Hanby AM, Speirs V, Kochanny S, Patnaik A, Srisuwananukorn A, Brenner H, Hoffmeister M, van den Brandt PA, Jäger D, Trautwein C, Pearson AT, Luedde T. Pan-cancer image-based detection of clinically actionable genetic alterations. Nat Cancer. 2020;1:789-799.  [PubMed]  [DOI]
34.  Kather JN, Pearson AT, Halama N, Jäger D, Krause J, Loosen SH, Marx A, Boor P, Tacke F, Neumann UP, Grabsch HI, Yoshikawa T, Brenner H, Chang-Claude J, Hoffmeister M, Trautwein C, Luedde T. Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer. Nat Med. 2019;25:1054-1056.  [PubMed]  [DOI]
35.  Yoshida H, Yamashita Y, Shimazu T, Cosatto E, Kiyuna T, Taniguchi H, Sekine S, Ochiai A. Automated histological classification of whole slide images of colorectal biopsy specimens. Oncotarget. 2017;8:90719-90729.  [PubMed]  [DOI]
36.  Korbar B, Olofson AM, Miraflor AP, Nicka CM, Suriawinata MA, Torresani L, Suriawinata AA, Hassanpour S. Deep Learning for Classification of Colorectal Polyps on Whole-slide Images. J Pathol Inform. 2017;8:30.  [PubMed]  [DOI]
37.  Haj-Hassan H, Chaddad A, Harkouss Y, Desrosiers C, Toews M, Tanougast C. Classifications of Multispectral Colorectal Cancer Tissues Using Convolution Neural Network. J Pathol Inform. 2017;8:1.  [PubMed]  [DOI]
38.  Xu Y, Jia Z, Wang LB, Ai Y, Zhang F, Lai M, Chang EI. Large scale tissue histopathology image classification, segmentation, and visualization via deep convolutional activation features. BMC Bioinformatics. 2017;18:281.  [PubMed]  [DOI]
39.  Yoon H, Lee J, Oh JE, Kim HR, Lee S, Chang HJ, Sohn DK. Tumor Identification in Colorectal Histology Images Using a Convolutional Neural Network. J Digit Imaging. 2019;32:131-140.  [PubMed]  [DOI]
40.  Sena P, Fioresi R, Faglioni F, Losi L, Faglioni G, Roncucci L. Deep learning techniques for detecting preneoplastic and neoplastic lesions in human colorectal histological images. Oncol Lett. 2019;18:6101-6107.  [PubMed]  [DOI]
41.  Wei JW, Suriawinata AA, Vaickus LJ, Ren B, Liu X, Lisovsky M, Tomita N, Abdollahi B, Kim AS, Snover DC, Baron JA, Barry EL, Hassanpour S. Evaluation of a Deep Neural Network for Automated Classification of Colorectal Polyps on Histopathologic Slides. JAMA Netw Open. 2020;3:e203398.  [PubMed]  [DOI]
42.  Bychkov D, Linder N, Turkki R, Nordling S, Kovanen PE, Verrill C, Walliander M, Lundin M, Haglund C, Lundin J. Deep learning based tissue analysis predicts outcome in colorectal cancer. Sci Rep. 2018;8:3395.  [PubMed]  [DOI]
43.  Mandal R, Samstein RM, Lee KW, Havel JJ, Wang H, Krishna C, Sabio EY, Makarov V, Kuo F, Blecua P, Ramaswamy AT, Durham JN, Bartlett B, Ma X, Srivastava R, Middha S, Zehir A, Hechtman JF, Morris LG, Weinhold N, Riaz N, Le DT, Diaz LA Jr, Chan TA. Genetic diversity of tumors with mismatch repair deficiency influences anti-PD-1 immunotherapy response. Science. 2019;364:485-491.  [PubMed]  [DOI]
44.  Le DT, Uram JN, Wang H, Bartlett BR, Kemberling H, Eyring AD, Skora AD, Luber BS, Azad NS, Laheru D, Biedrzycki B, Donehower RC, Zaheer A, Fisher GA, Crocenzi TS, Lee JJ, Duffy SM, Goldberg RM, de la Chapelle A, Koshiji M, Bhaijee F, Huebner T, Hruban RH, Wood LD, Cuka N, Pardoll DM, Papadopoulos N, Kinzler KW, Zhou S, Cornish TC, Taube JM, Anders RA, Eshleman JR, Vogelstein B, Diaz LA Jr. PD-1 Blockade in Tumors with Mismatch-Repair Deficiency. N Engl J Med. 2015;372:2509-2520.  [PubMed]  [DOI]
45.  Lynch HT, Snyder CL, Shaw TG, Heinen CD, Hitchins MP. Milestones of Lynch syndrome: 1895-2015. Nat Rev Cancer. 2015;15:181-194.  [PubMed]  [DOI]
46.  Echle A, Grabsch HI, Quirke P, van den Brandt PA, West NP, Hutchins GGA, Heij LR, Tan X, Richman SD, Krause J, Alwers E, Jenniskens J, Offermans K, Gray R, Brenner H, Chang-Claude J, Trautwein C, Pearson AT, Boor P, Luedde T, Gaisa NT, Hoffmeister M, Kather JN. Clinical-Grade Detection of Microsatellite Instability in Colorectal Tumors by Deep Learning. Gastroenterology. 2020;159:1406-1416.e11.  [PubMed]  [DOI]
47.  Colling R, Pitman H, Oien K, Rajpoot N, Macklin P;  CM-Path AI in Histopathology Working Group; Snead D, Sackville T, Verrill C. Artificial intelligence in digital pathology: a roadmap to routine use in clinical practice. J Pathol. 2019;249:143-150.  [PubMed]  [DOI]
48.  Steiner DF, MacDonald R, Liu Y, Truszkowski P, Hipp JD, Gammage C, Thng F, Peng L, Stumpe MC. Impact of Deep Learning Assistance on the Histopathologic Review of Lymph Nodes for Metastatic Breast Cancer. Am J Surg Pathol. 2018;42:1636-1646.  [PubMed]  [DOI]
49.  Ehteshami Bejnordi B, Veta M, Johannes van Diest P, van Ginneken B, Karssemeijer N, Litjens G, van der Laak JAWM;  the CAMELYON16 Consortium; Hermsen M, Manson QF, Balkenhol M, Geessink O, Stathonikos N, van Dijk MC, Bult P, Beca F, Beck AH, Wang D, Khosla A, Gargeya R, Irshad H, Zhong A, Dou Q, Li Q, Chen H, Lin HJ, Heng PA, Haß C, Bruni E, Wong Q, Halici U, Öner MÜ, Cetin-Atalay R, Berseth M, Khvatkov V, Vylegzhanin A, Kraus O, Shaban M, Rajpoot N, Awan R, Sirinukunwattana K, Qaiser T, Tsang YW, Tellez D, Annuscheit J, Hufnagl P, Valkonen M, Kartasalo K, Latonen L, Ruusuvuori P, Liimatainen K, Albarqouni S, Mungal B, George A, Demirci S, Navab N, Watanabe S, Seno S, Takenaka Y, Matsuda H, Ahmady Phoulady H, Kovalev V, Kalinovsky A, Liauchuk V, Bueno G, Fernandez-Carrobles MM, Serrano I, Deniz O, Racoceanu D, Venâncio R. Diagnostic Assessment of Deep Learning Algorithms for Detection of Lymph Node Metastases in Women With Breast Cancer. JAMA. 2017;318:2199-2210.  [PubMed]  [DOI]
50.  Kotsenas AL, Balthazar P, Andrews D, Geis JR, Cook TS. Rethinking Patient Consent in the Era of Artificial Intelligence and Big Data. J Am Coll Radiol. 2021;18:180-184.  [PubMed]  [DOI]
51.  Campanella G, Hanna MG, Geneslaw L, Miraflor A, Werneck Krauss Silva V, Busam KJ, Brogi E, Reuter VE, Klimstra DS, Fuchs TJ. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat Med. 2019;25:1301-1309.  [PubMed]  [DOI]
52.  Jones AD, Graff JP, Darrow M, Borowsky A, Olson KA, Gandour-Edwards R, Datta Mitra A, Wei D, Gao G, Durbin-Johnson B, Rashidi HH. Impact of pre-analytical variables on deep learning accuracy in histopathology. Histopathology. 2019;75:39-53.  [PubMed]  [DOI]
53.  Hipp JD, Sica J, McKenna B, Monaco J, Madabhushi A, Cheng J, Balis UJ. The need for the pathology community to sponsor a whole slide imaging repository with technical guidance from the pathology informatics community. J Pathol Inform. 2011;2:31.  [PubMed]  [DOI]
54.  Cooper LA, Demicco EG, Saltz JH, Powell RT, Rao A, Lazar AJ. PanCancer insights from The Cancer Genome Atlas: the pathologist's perspective. J Pathol. 2018;244:512-524.  [PubMed]  [DOI]
55.  Hartman DJ, Van Der Laak JAWM, Gurcan MN, Pantanowitz L. Value of Public Challenges for the Development of Pathology Deep Learning Algorithms. J Pathol Inform. 2020;11:7.  [PubMed]  [DOI]
56.  Yoshida H, Yokota H, Singh R, Kiyuna T, Yamaguchi M, Kikuchi S, Yagi Y, Ochiai A. Meeting Report: The International Workshop on Harmonization and Standardization of Digital Pathology Image, Held on April 4, 2019 in Tokyo. Pathobiology. 2019;86:322-324.  [PubMed]  [DOI]
57.  Inoue T, Yagi Y. Color standardization and optimization in whole slide imaging. Clin Diagn Pathol. 2020;4.  [PubMed]  [DOI]
58.  Janowczyk A, Zuo R, Gilmore H, Feldman M, Madabhushi A. HistoQC: An Open-Source Quality Control Tool for Digital Pathology Slides. JCO Clin Cancer Inform. 2019;3:1-7.  [PubMed]  [DOI]
59.  Senaras C, Niazi MKK, Lozanski G, Gurcan MN. DeepFocus: Detection of out-of-focus regions in whole slide digital images using deep learning. PLoS One. 2018;13:e0205387.  [PubMed]  [DOI]
60.  Komura D, Ishikawa S. Machine Learning Methods for Histopathological Image Analysis. Comput Struct Biotechnol J. 2018;16:34-42.  [PubMed]  [DOI]
61.  Yagi Y, Gilbertson JR. Digital imaging in pathology: the case for standardization. J Telemed Telecare. 2005;11:109-116.  [PubMed]  [DOI]
62.  Janowczyk A, Basavanhally A, Madabhushi A. Stain Normalization using Sparse AutoEncoders (StaNoSA): Application to digital pathology. Comput Med Imaging Graph. 2017;57:50-61.  [PubMed]  [DOI]
63.  Vahadane A, Peng T, Sethi A, Albarqouni S, Wang L, Baust M, Steiger K, Schlitter AM, Esposito I, Navab N. Structure-Preserving Color Normalization and Sparse Stain Separation for Histological Images. IEEE Trans Med Imaging. 2016;35:1962-1971.  [PubMed]  [DOI]
64.  Dietterich TG, Lathrop RH, Lozano-Pérez T. Solving the multiple instance problem with axis-parallel rectangles. Artificial Intelligence. 1997;89:31-71.  [PubMed]  [DOI]
65.  Cosatto E, Laquerre PF, Malon C, Graf HP, Saito A, Kiyuna T, Marugame A, Kamijo K.   Automated gastric cancer diagnosis on H and E-stained sections; training a classifier on a large scale with multiple instance machine learning. Proceedings of SPIE - Progress in Biomedical Optics and Imaging, MI: 2013.  [PubMed]  [DOI]
66.  Mattocks CJ, Morris MA, Matthijs G, Swinnen E, Corveleyn A, Dequeker E, Müller CR, Pratt V, Wallace A;  EuroGentest Validation Group. A standardized framework for the validation and verification of clinical molecular genetic tests. Eur J Hum Genet. 2010;18:1276-1288.  [PubMed]  [DOI]
67.  Pasa F, Golkov V, Pfeiffer F, Cremers D, Pfeiffer D. Efficient Deep Network Architectures for Fast Chest X-Ray Tuberculosis Screening and Visualization. Sci Rep. 2019;9:6268.  [PubMed]  [DOI]
68.  Allen TC. Regulating Artificial Intelligence for a Successful Pathology Future. Arch Pathol Lab Med. 2019;143:1175-1179.  [PubMed]  [DOI]
69.  U.S. Food and Drug Administration. Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD). [cited 7 January 2021]. In: U.S. Food and Drug Administration [Internet]. Available from: https://www.fda.gov/media/122535/download.  [PubMed]  [DOI]
70.  European Commission. Medical Devices – Sector. [cited 7 January 2021]. In: European Commission [Internet]. Available from: https://ec.europa.eu/growth/sectors/medical-devices_en.  [PubMed]  [DOI]
71.  Retamero JA, Aneiros-Fernandez J, Del Moral RG. Complete Digital Pathology for Routine Histopathology Diagnosis in a Multicenter Hospital Network. Arch Pathol Lab Med. 2020;144:221-228.  [PubMed]  [DOI]
72.  Chen PC, Gadepalli K, MacDonald R, Liu Y, Kadowaki S, Nagpal K, Kohlberger T, Dean J, Corrado GS, Hipp JD, Mermel CH, Stumpe MC. An augmented reality microscope with real-time artificial intelligence integration for cancer diagnosis. Nat Med. 2019;25:1453-1457.  [PubMed]  [DOI]
73.  Hegde N, Hipp JD, Liu Y, Emmert-Buck M, Reif E, Smilkov D, Terry M, Cai CJ, Amin MB, Mermel CH, Nelson PQ, Peng LH, Corrado GS, Stumpe MC. Similar image search for histopathology: SMILY. NPJ Digit Med. 2019;2:56.  [PubMed]  [DOI]
74.  Tosun AB, Pullara F, Becich MJ, Taylor DL, Fine JL, Chennubhotla SC. Explainable AI (xAI) for Anatomic Pathology. Adv Anat Pathol. 2020;27:241-250.  [PubMed]  [DOI]
75.  Montavon G, Samek W, Müller K-R. Methods for interpreting and understanding deep neural networks. Digit Signal Process. 2018;73:1-15.  [PubMed]  [DOI]
76.  Yang JH, Wright SN, Hamblin M, McCloskey D, Alcantar MA, Schrübbers L, Lopatkin AJ, Satish S, Nili A, Palsson BO, Walker GC, Collins JJ. A White-Box Machine Learning Approach for Revealing Antibiotic Mechanisms of Action. Cell. 2019;177:1649-1661.e9.  [PubMed]  [DOI]
77.  Kuhn DR, Kacker RN, Lei Y, Simos DE.   Combinatorial Methods for Explainable AI. Proceedings of the 2020 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW); 2020 Oct 24-28. IEEE, 2020: 167-170.  [PubMed]  [DOI]
78.  Arora A, Arora A. Pathology training in the age of artificial intelligence. J Clin Pathol. 2021;74:73-75.  [PubMed]  [DOI]
79.  Bollschweiler EH, Mönig SP, Hensler K, Baldus SE, Maruyama K, Hölscher AH. Artificial neural network for prediction of lymph node metastases in gastric cancer: a phase II diagnostic study. Ann Surg Oncol. 2004;11:506-511.  [PubMed]  [DOI]
80.  Duraipandian S, Sylvest Bergholt M, Zheng W, Yu Ho K, Teh M, Guan Yeoh K, Bok Yan So J, Shabbir A, Huang Z. Real-time Raman spectroscopy for in vivo, online gastric cancer diagnosis during clinical endoscopic examination. J Biomed Opt. 2012;17:081418.  [PubMed]  [DOI]
81.  Jiang Y, Xie J, Han Z, Liu W, Xi S, Huang L, Huang W, Lin T, Zhao L, Hu Y, Yu J, Zhang Q, Li T, Cai S, Li G. Immunomarker Support Vector Machine Classifier for Prediction of Gastric Cancer Survival and Adjuvant Chemotherapeutic Benefit. Clin Cancer Res. 2018;24:5574-5584.  [PubMed]  [DOI]
82.  Qu J, Hiruta N, Terai K, Nosato H, Murakawa M, Sakanashi H. Gastric Pathology Image Classification Using Stepwise Fine-Tuning for Deep Neural Networks. J Healthc Eng. 2018;2018:8961781.  [PubMed]  [DOI]
83.  León F, Gélvez M, Jaimes Z, Gelvez T, Arguello H.   Supervised Classification of Histopathological Images Using Convolutional Neuronal Networks for Gastric Cancer Detection. 2019 XXII Symposium on Image, Signal Processing and Artificial Vision (STSIVA). IEEE, 2019: 1-5.  [PubMed]  [DOI]
84.  Liang Q, Nan Y, Coppola G, Zou K, Sun W, Zhang D, Wang Y, Yu G. Weakly Supervised Biomedical Image Segmentation by Reiterative Learning. IEEE J Biomed Health Inform. 2019;23:1205-1214.  [PubMed]  [DOI]
85.  Sun M, Zhang G, Dang H, Qi X, Zhou X, Chang Q. Accurate Gastric Cancer Segmentation in Digital Pathology Images Using Deformable Convolution and Multi-Scale Embedding Networks. IEEE Access. 2019;7:75530-75541.  [PubMed]  [DOI]
86.  Wang S, Zhu Y, Yu L, Chen H, Lin H, Wan X, Fan X, Heng PA. RMDL: Recalibrated multi-instance deep learning for whole slide gastric image classification. Med Image Anal. 2019;58:101549.  [PubMed]  [DOI]
87.  Awan R, Sirinukunwattana K, Epstein D, Jefferyes S, Qidwai U, Aftab Z, Mujeeb I, Snead D, Rajpoot N. Glandular Morphometrics for Objective Grading of Colorectal Adenocarcinoma Histology Images. Sci Rep. 2017;7:16852.  [PubMed]  [DOI]
88.  Kainz P, Pfeiffer M, Urschler M. Segmentation and classification of colon glands with deep convolutional neural networks and total variation regularization. PeerJ. 2017;5:e3874.  [PubMed]  [DOI]
89.  Alom M, Yakopcic C, Taha T, Asari V.   Microscopic Nuclei Classification, Segmentation and Detection with improved Deep Convolutional Neural Network (DCNN) Approaches. 2018 Preprint. Available from: arXiv:1811.03447.  [PubMed]  [DOI]
90.  Weis CA, Kather JN, Melchers S, Al-Ahmdi H, Pollheimer MJ, Langner C, Gaiser T. Automatic evaluation of tumor budding in immunohistochemically stained colorectal carcinomas and correlation to clinical outcome. Diagn Pathol. 2018;13:64.  [PubMed]  [DOI]
91.  Ponzio F, Macii E, Ficarra E, Di Cataldo S.   Colorectal Cancer Classification using Deep Convolutional Networks - An Experimental Study. Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 2. Bioimaging, 2018: 58-66.  [PubMed]  [DOI]
92.  Shapcott M, Hewitt KJ, Rajpoot N. Deep Learning With Sampling in Colon Cancer Histology. Front Bioeng Biotechnol. 2019;7:52.  [PubMed]  [DOI]
93.  Swiderska-Chadaj Z, Pinckaers H, van Rijthoven M, Balkenhol M, Melnikova M, Geessink O, Manson Q, Sherman M, Polonia A, Parry J, Abubakar M, Litjens G, van der Laak J, Ciompi F. Learning to detect lymphocytes in immunohistochemistry with deep learning. Med Image Anal. 2019;58:101547.  [PubMed]  [DOI]