Retrospective Study
Copyright ©The Author(s) 2023. Published by Baishideng Publishing Group Inc. All rights reserved.
Artif Intell Gastroenterol. Jun 8, 2023; 4(1): 10-27
Published online Jun 8, 2023. doi: 10.35712/aig.v4.i1.10
Risk factor profiles for gastric cancer prediction with respect to Helicobacter pylori: A study of a tertiary care hospital in Pakistan
Shahid Aziz, Simone König, Muhammad Umer, Tayyab Saeed Akhter, Shafqat Iqbal, Maryum Ibrar, Tofeeq Ur-Rehman, Tanvir Ahmad, Alfizah Hanafiah, Rabaab Zahra, Faisal Rasheed
Shahid Aziz, Tanvir Ahmad, Faisal Rasheed, Patients Diagnostic Lab, Isotope Application Division, Pakistan Institute of Nuclear Science and Technology, Islamabad 44000, Pakistan
Shahid Aziz, Rabaab Zahra, Department of Microbiology, Quaid-i-Azam University, Islamabad 45320, Pakistan
Shahid Aziz, Simone König, Interdisciplinary Center for Clinical Research, Core Unit Proteomics, University of Münster, Münster 48149, Germany
Muhammad Umer, Management Information System Division, Pakistan Institute of Nuclear Science and Technology, Islamabad 44000, Pakistan
Tayyab Saeed Akhter, Shafqat Iqbal, Centre for Liver and Digestive Diseases, Holy Family Hospital, Rawalpindi 46300, Pakistan
Maryum Ibrar, Pakistan Scientific and Technological Information Centre, Quaid-i-Azam University, Islamabad 45320, Pakistan
Tofeeq Ur-Rehman, Department of Pharmacy, Quaid-i-Azam University, Islamabad 45320, Pakistan
Alfizah Hanafiah, Faculty of Medicine, Department of Medical Microbiology and Immunology, Universiti Kebangsaan Malaysia, Cheras, Kuala Lumpur 56000, Malaysia
Author contributions: Rasheed F and Aziz S contributed to conceptualization; Aziz S contributed to methodology; Umer M contributed to software; Aziz S contributed to validation; Aziz S, König S and Ibrar M contributed to formal analysis; Rasheed F contributed to resources; Akhter TS and Iqbal S contributed to endoscopic procedures; Aziz S and König S contributed to writing – original draft preparation; Aziz S, König S, Ahmad T, Rasheed F, Hanafiah A, and Ur-Rehman T contributed to writing – review & editing; König S contributed to visualization; Zahra R and Rasheed F contributed to supervision; Aziz S and Rasheed F contributed to project administration; Aziz S and Rasheed F contributed to funding acquisition.
Institutional review board statement: Ethical approvals were granted from the Ethical Technical Committee, Pakistan Institute of Nuclear Science and Technology (PINSTECH), Islamabad (Ref.-No. PINST/DC-26/2017), the Bioethics Committee, Quaid-i-Azam University, Islamabad, Pakistan (Ref.-No. BBC-FBS-QAU2019-159), and the Institutional Research Forum, Holy Family Hospital, Rawalpindi Medical University, Rawalpindi (Ref.-No. R-40/RMU).
Informed consent statement: The investigators explained the study to each patient, and written informed consent to participate in this research was obtained. Clinical data were collected by questionnaire during an interview after endoscopic evaluation. Patients also gave informed consent for the analysis and publication of their anonymized clinical data.
Conflict-of-interest statement: All the authors declare no conflict of interest.
Data sharing statement: All data have been shared in the supplementary files.
Open-Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:
Corresponding author: Shahid Aziz, PhD, Research Fellow, Patients Diagnostic Lab, Isotope Application Division, Pakistan Institute of Nuclear Science and Technology, Nilore, Islamabad 44000, Pakistan.
Received: December 14, 2022
Peer-review started: December 14, 2022
First decision: January 22, 2023
Revised: April 1, 2023
Accepted: April 20, 2023
Article in press: April 20, 2023
Published online: June 8, 2023

Gastric cancer (GC) is the fourth leading cause of cancer-related deaths worldwide. Diagnosis relies on histopathology and the number of endoscopies is increasing. Helicobacter pylori (H. pylori) infection is a major risk factor.


To develop an in-silico GC prediction model to reduce the number of diagnostic surgical procedures. The metadata of patients with gastroduodenal symptoms, risk factors associated with GC, and H. pylori infection status from Holy Family Hospital Rawalpindi, Pakistan, were analyzed with machine learning.


A cohort of 341 patients was divided into three groups [normal gastric mucosa (NGM), gastroduodenal diseases (GDD), and GC]. Information on socioeconomic and demographic conditions and GC risk factors was collected using a questionnaire. H. pylori infection status was determined using the urea breath test. The association between these factors and histopathological grades was assessed statistically. K-Nearest Neighbors and Random Forest (RF) machine learning models were tested.
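To make the K-Nearest Neighbors idea concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the three features (age, vomiting, bloating), the record values, and the function name `knn_predict` are hypothetical and chosen only to illustrate majority voting among the nearest records across the three groups named above.

```python
import math
from collections import Counter

# Toy, made-up records: (age in years, vomiting 0/1, bloating 0/1) -> group.
# These are NOT study data; they only illustrate the mechanics.
training = [
    ((25, 0, 0), "NGM"),
    ((30, 0, 1), "NGM"),
    ((28, 0, 0), "NGM"),
    ((45, 1, 1), "GDD"),
    ((50, 0, 1), "GDD"),
    ((48, 1, 0), "GDD"),
    ((68, 1, 1), "GC"),
    ((72, 1, 0), "GC"),
    ((65, 1, 1), "GC"),
]


def knn_predict(features, k=3):
    """Classify one record by majority vote among its k nearest training records."""
    # Sort the training records by Euclidean distance to the query record.
    by_distance = sorted(training, key=lambda item: math.dist(item[0], features))
    # Count the group labels of the k closest records and return the most common.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


older_with_symptoms = knn_predict((70, 1, 1))
younger_without_symptoms = knn_predict((27, 0, 0))
print(older_with_symptoms, younger_without_symptoms)
```

In this toy setup age dominates the distance because it is on a much larger scale than the binary symptoms; a real analysis would normally rescale the features first.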


The overall frequency of H. pylori infection among enrolled subjects was 64.2% (219/341). It was higher in GC (74.2%, 23/31) than in NGM and GDD, and higher in males (54.3%, 119/219) than in females. Abdominal pain (72.4%, 247/341) was observed more often than other clinical symptoms, including vomiting, bloating, acid reflux and heartburn. The majority of the GC patients experienced vomiting (91%, 20/22) together with abdominal pain (100%, 22/22). The multinomial logistic regression model was statistically significant and correctly classified 80% of the GDD/GC cases. Age, income level, vomiting, bloating and medication were significantly associated with GDD and GC. A dynamic RF GC-predictive model was developed, which achieved > 80% test accuracy.
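The reported percentages can be recomputed directly from the counts given in the text. The short sketch below does only that arithmetic; the dictionary labels and the helper name `percent` are illustrative and not taken from the study.

```python
# Counts (numerator, denominator) exactly as reported in the results text.
reported = {
    "H. pylori positive, all patients": (219, 341),
    "H. pylori positive, GC group": (23, 31),
    "Males among H. pylori positive": (119, 219),
    "Abdominal pain, all patients": (247, 341),
    "Vomiting among GC patients": (20, 22),
}


def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


percentages = {label: percent(n, d) for label, (n, d) in reported.items()}

for label, value in percentages.items():
    print(f"{label}: {value}%")
```

The first four values match the text to one decimal place; 20/22 comes out as 90.9%, which the text rounds to 91%.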


GC risk factors were incorporated into a computer model to predict the likelihood of developing GC with high sensitivity and specificity. The model is dynamic and will be further improved and validated by including new data in future research studies. Its use may reduce unnecessary endoscopic procedures. It is freely available.
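For readers less familiar with the terms, sensitivity and specificity follow from the counts of a two-by-two confusion table. The sketch below uses hypothetical counts (they are not results from this study) and a hypothetical helper name, `sensitivity_specificity`, purely to show the calculation.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Compute sensitivity and specificity from confusion-table counts.

    tp: cases correctly predicted as positive
    fn: cases wrongly predicted as negative
    tn: non-cases correctly predicted as negative
    fp: non-cases wrongly predicted as positive
    """
    sensitivity = tp / (tp + fn)  # share of true cases that are detected
    specificity = tn / (tn + fp)  # share of non-cases that are correctly cleared
    return sensitivity, specificity


# Hypothetical counts for illustration only; NOT taken from the study.
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=45, fp=5)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

A tool intended to reduce unnecessary endoscopies would need both values to be high: low sensitivity misses cancers, while low specificity sends patients without cancer to endoscopy anyway.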

Keywords: Gastric cancer, Gastritis, Machine learning, Prediction model, Helicobacter pylori

Core Tip: This retrospective study reports the prevalence of Helicobacter pylori (H. pylori) infection in Pakistan and its association with risk factors that relate directly or indirectly to gastroduodenal diseases (GDD) such as gastritis, ulcers, and gastric cancer (GC). GC risk factors were incorporated into a sensitive and specific dynamic computer tool that predicts GC with > 80% test accuracy. This GC prediction model is freely available and may be used to reduce unnecessary invasive procedures such as endoscopies. The study may help healthcare authorities understand the burden of GDD and GC, which is intertwined with H. pylori infection.