Machine learning insights concerning inflammatory and liver-related risk comorbidities in non-communicable and viral diseases

doi:10.3748/wjg.v28.i44.6230



Journal Information of This Article

Publication Name: World Journal of Gastroenterology
ISSN: 1007-9327
Publisher: Baishideng Publishing Group Inc, 7041 Koll Center Parkway, Suite 160, Pleasanton, CA 94566, USA

Review

World J Gastroenterol. Nov 28, 2022; 28(44): 6230-6248
Published online Nov 28, 2022. doi: 10.3748/wjg.v28.i44.6230

Table 1 Summary of machine learning articles studying virus and inflammatory-related liver damage

Ref.	Objective	Subjects	Variables	ML model	Performance	Observations/remarks
Fialoke et al[63]	To predict NASH in NAFLD patients	n = 108139, NASH and healthy (non-NASH) populations	Demographic data, type 2 diabetes status, and blood biomarkers	RF, XGBoosting, DT, LR	AUROC of 88% by XGBoosting	The average and maximum values of ALT were the most important variables
Ma et al[64]	To predict NAFLD in the general population	n = 10508, subjects who attended a health examination	Age, blood biomarkers, and anthropometric data	LR, RF, SVM, bagging, DT, KNN, BN, hidden NB, AdaBoosting, AODE	83% accuracy, 0.878 specificity, 0.675 sensitivity, and 0.655 F-measure score by BN	BMI, TG, GGT, ALT, and uric acid were the top five predictors
Yip et al[65]	To detect NAFLD in the general population	n = 500, involving NAFLD patients and healthy subjects	Demographic, clinical data and blood biomarkers	LR, RIDGE regression, AdaBoosting, DT	AUROC of 90% by AdaBoosting	ALT, HDL-c, TG, HbA1c, and white blood cells were the top predictors
Pei et al[66]	To identify FLD in general patients	n = 3419, patients of which 845 had FLD	Age, anthropometric, and blood biomarkers	RF, ANN, KNN, XGBoosting, LDA	0.9415 accuracy, 0.9306 AUC, and 0.9091 sensitivity by XGBoosting	Uric acid, BMI, and TG were the top three risk factors
Choi et al[67]	To stage liver fibrosis	n = 7461, patients with pathologically confirmed liver fibrosis	Age, sex, clinical data, CT images, and liver fibrosis stage	CNN	Overall staging accuracy of 79.4% and an AUROC of 0.96, 0.97, and 0.95 for diagnosing significant and advanced fibrosis, and cirrhosis, respectively	The model outperformed the radiologist’s interpretation, APRI, and FIB-4 index
Chen et al[68]	To stage liver fibrosis in patients with CHB	n = 513, patients with confirmed liver fibrosis	Age, sex, CT liver images	RF, KNN, SVM, NB	0.8118-0.9125 accuracy by RF for all stages	The adopted classifiers significantly outperformed the liver fibrosis index method
Jeong et al[69]	To classify susceptible individuals for adjuvant treatment in patients with ICC after resection	n = 1421, ICC patients	Age, sex, clinical data, and blood biomarkers	DNN	AUC of 0.78	The model was found to be more accurate than the traditional AJCC stage classifier
Wübbolding et al[70]	To identify immune profiles for the prediction of early virological relapse	n = 284, patients with CHB treated with NA antivirals	Age, sex, and analytical and blood biomarkers	KNN, RF, LR	AUC of 0.89	The combination of IL-2, MIG/CXCL9, RANTES/CCL5, SCF, and TRAIL was reliable in predicting viral relapse
Hong et al[71]	To predict esophageal varices in patients with HBV-related cirrhosis	n = 197, patients with HBV-related cirrhosis	PLT count, spleen width, and portal vein diameter	ANN	Sensitivity of 96.5%, specificity of 60.4%, accuracy of 86.8%	The model obtained a positive predictive value of 90.00% and a negative predictive value of 80.85%
Zhong et al[72]	To compare the prognostic performance of ALBI and CTP grades for HCC treated with TACE combined with sorafenib as an initial treatment	n = 504, HCC patients	ALBI and CTP grades, BCLC stage, clinical data, and plasma α-fetoprotein	ANN	-	The ALBI grade had higher importance in survival prediction than the CTP grade
Shi et al[73]	To predict in-hospital mortality after primary liver cancer surgery	n = 22926, HCC surgery patients	Age, sex, clinical, and hospital data	ANN, LR	Accuracy of 97.28% and AUROC of 84.67% by ANN	The ANN model had higher overall performance indices and accurately predicted in-hospital mortality
Shi et al[74]	To predict 5-yr mortality after surgery for HCC	n = 22926, HCC surgery patients	Age, sex, clinical, and hospital data	ANN, LR	Accuracy of 96.57% and AUROC of 88.51% by ANN	Surgeon volume was the top predictor
Patnaik et al[75]	To predict liver function-related scores (MELD, APRI, CTP) using breath biomarkers	n = 28, healthy patients compared to n = 17, liver patients	Age, anthropometric data, blood biomarkers, breath analysis	LR, RF, SVR, ETR	R² values of 0.78, 0.82, and 0.85 for CTP score, APRI score, and MELD, respectively, by ETR	Isoprene, limonene, and dimethyl sulfide are potential biomarkers for liver disease
Butt et al[85]	To diagnose the stage of hepatitis C	n = 968, patients with HCV	Age, anthropometric data, blood biomarkers, and histological staging	ANN, RF, SVM, XGBoosting	98.89% precision by ANN	The model performed better than previously presented models by other authors
Wei et al[87]	To predict HBV and HCV-related hepatic fibrosis	n = 490, HBV patients; n = 254, and 230 HCV patients	Age, BMI, analytical data (FIB-4 score), and liver biopsy	GB, DT, RF	AUROC of 0.918 by GB	GB outperformed the FIB-4 predictive score
Barakat et al[89]	To predict and stage hepatic fibrosis in children with HCV	n = 166, children with CHC	Analytical data (APRI and FIB-4 scores)	RF	AUC of 0.903 for any type of fibrosis	RF outperformed the FIB-4 and APRI predictive scores
Konerman et al[88]	To predict progression of HCV	n = 72683, veterans with CHC	Age, BMI, demographic, and blood biomarkers (APRI score)	CS and LGT Cox and boosting	AUROC of 0.830 and 0.77 sensitivity by LGT boosting model for 1 yr follow-up	APRI and PLT count were top predictors in the LGT boosting model
Wong et al[86]	To predict HCC in patients with CVH	n = 86804, CVH patients, of whom 6821 had HCC	Age, sex, clinical data, and blood biomarkers	LR, RIDGE regression, AdaBoosting, RF, DT	AUROC of 0.992 and 0.837 by RF in training and validation cohort, respectively	ML models obtained better AUROCs than traditional HCC risk scores
Feldman et al[91]	To predict DAA therapy duration in hepatitis C	n = 3943, HCV patients with sofosbuvir/ledipasvir as the first course of DAA, of whom n = 240 received prolonged DAA treatment	Age, sex, and clinical data (including hepatitis C record data)	XGBoosting, RF, SVM	AUC of 0.745 by XGBoosting	Results showed age, comorbidity burden, and type 2 diabetes status as new predictors for DAA therapy duration
Kamboj et al[92]	To predict repurposed drugs for HCV	n = 17968, HCV molecular fingerprints	Experimentally validated small molecules from the ChEMBL database with bioactivity against HCV NS3, NS3/4A, NS5A and NS5B proteins	SVM, ANN, KNN, RF	R² value of 0.92 by SVM	Results identified more than 8 repurposed anti-HCV treatments
Tian et al[93]	To predict HBsAg seroclearance	n = 2235, patients with CHB, of which 106 achieved HBsAg seroclearance	Age, BMI, demographic and clinical data, and blood biomarkers	LR, RF, DT, XGBoosting	AUC of 0.891 by XGBoosting	Level of HBsAg followed by age and HBV DNA were the top predictors
Chen et al[94]	To predict HBV-induced HCC using quasispecies patterns of HBV	n = 307, CHB patients; n = 237, HBV-related HCC patients	rt nucleic acid and rt/s amino acid sequences	SVM, RF, KNN, LR	AUC of 0.96, and accuracy of 0.90 by RF	HBV rt gene features can efficiently discriminate HCC from CHB
Mueller-Breckenridge et al[95]	To classify HBeAg status in HBV patients using virus full-length genome quasispecies	n = 352, CHB untreated patients	Matrix of allele frequencies (0.1-0.99) and the associated HBeAg status	RF	Balanced accuracy range of 0.8-1	n1896GA, n1934AT, n1753TC mutants were the highest-ranking variables
Kayvanjoo et al[96]	To predict HCV interferon/ribavirin therapy outcome based on viral nucleotide attributes	n = 76, gene attributes	HCV nucleotide attributes	DT, SVM, NB, DNN	Accuracy of 84.17% by SVM in responder vs relapser of subtype 1b sequences	Dinucleotides UA and UU were top predictors in the combination treatment outcome
Li et al[98]	To distinguish influenza from COVID-19 patients	n = 398, COVID-19 and influenza cases	Age, sex, blood biomarkers, clinical data, and CT and X-ray scans	XGBoosting, RF, and LASSO and RIDGE regression models	AUC of 0.990, sensitivity of 92.5% and a specificity of 97.9% by XGBoosting	Age, CT scan result, and temperature were the top three predictors
Bhargava et al[99]	To detect novel COVID-19 and discriminate it from pneumonia	n = 31454, images acquired from nine distinct datasets of COVID-19 patients	CT or X-ray scans	KNN, SRC, ANN, SVM	Accuracy of 99.14% by SVM	The SVM model classified the images as normal, pneumonia, and COVID-19 positive with the highest recognition rate
Bennett et al[97]	To predict early severity and clinically characterize COVID-19 patients	n = 174568, patients with a positive lab test for COVID-19	Age, sex, demographic, anthropometric and clinical data, and blood biomarkers	RF, LR, XGBoosting	AUROC of 0.87 by XGBoosting	Age, oxygen respiratory rate, and blood urea nitrogen were ranked as the top predictors for severity outcome
Günster et al[100]	To identify independent risk factors for 180-d all-cause mortality in COVID-19 patients	n = 8679, hospitalized COVID-19 patients	Age, sex, BMI, and clinical data	LR	AUC of 0.81	A high BMI and age were strong risk factors for 180-d all-cause mortality, while female sex was protective
Deng et al[101]	To identify clinical indicators for COVID-19	n = 379, patients, 62 with COVID-19 and 317 with pneumonia	Age, sex, demographic and clinical data, and blood biomarkers	EBM	AUC of 0.948	Variables grouped under liver function were the top predictor category for COVID-19 prediction
Lipták et al[102]	To identify gastrointestinal predictors for the risk of COVID-19-related hospitalization	n = 680, patients	Age, sex, clinical data, and blood biomarkers	RF	AUC of 0.799	AST was the top predictor for hospitalization
Elemam et al[103]	To identify immunological and clinical predictors of COVID-19 severity and sequelae	n = 37, COVID-19 patients; n = 40, controls	Age, sex, BMI, clinical data, and blood biomarkers	Stepwise linear regression	AUC of 0.93 for cytokines as predictors. AUC of 0.98 for biochemical markers as predictors	IL-6 and granzyme B were top potential predictors of liver injury in COVID-19 patients
Mashraqi et al[104]	To predict adverse effects on liver functions of COVID-19 ICU patients	n = 140, COVID-19 patients admitted to ICU	Blood biomarkers and existence of liver damage	SVM, KNN, ANN, NB, DT	AUC of 0.857 and precision of 0.95 by SVM	AST and ALT were top predictors of liver damage in these patients
Soltan et al[106]	To evaluate a laboratory-free COVID-19 triage for emergency care	n = 114957, emergency presentations prior to the global COVID-19 pandemic and n = 437, COVID-19 positive	Blood biomarkers, blood gas, and vital signs	LR, XGBoosting, RF	AUROC range of 0.9-0.94 by XGBoosting for datasets	The model could effectively triage patients presenting to hospital for COVID-19 without lab results
Gao et al[111]	To predict mortality in patients with alcoholic hepatitis	n = 210, alcoholic hepatitis patients	Age, clinical data, blood biomarkers, and omics data sets (metagenomics, lipidomics, and metabolomics)	GB, LR, SVM, RF	AUC of 0.87 by GB for 30-d mortality prediction using the dataset combining clinical data, bacteria and MetaCyc pathways and for 90-d mortality prediction using the fungi dataset	The model performed better than the currently used MELD score

NASH: Non-alcoholic steatohepatitis; NAFLD: Non-alcoholic fatty liver disease; CHB: Chronic hepatitis B virus infection; HCC: Hepatocellular carcinoma; HCV: Hepatitis C virus; CHC: Chronic hepatitis C virus infection; CVH: Chronic viral hepatitis; RF: Random forest; DT: Decision trees; LR: Logistic regression; SVM: Support vector machine; KNN: K-nearest neighbors; BN: Bayesian network; NB: Naïve Bayes; AODE: Aggregating one-dependence estimators; FLD: Fatty liver disease; ANN: Artificial neural networks; LDA: Linear discriminant analysis; CNN: Convolutional neural network; DNN: Deep neural network; SRC: Sparse representative classifier; EBM: Explainable boosting machine; CS: Cross-sectional; LGT: Longitudinal; HBsAg: Hepatitis B surface antigen; HBeAg: Hepatitis B virus e antigen; BMI: Body mass index; ALT: Alanine transaminase; AST: Aspartate transaminase; APRI: Aspartate transaminase/platelet ratio index; COVID-19: Coronavirus disease 2019; CT: Computed tomography; GB: Gradient Boosting; AUC: Area under the curve; AUROC: Area under the receiver operating characteristic curve; ICU: Intensive care unit; IL-6: Interleukin 6; DAA: Direct-acting antiviral; MELD: Model for end-stage liver disease; TG: Triglycerides; HbA1c: Glycated hemoglobin A1c; ICC: Intrahepatic cholangiocarcinoma; ML: Machine learning; ETR: Extra tree regression; AJCC: American Joint Committee on Cancer; CXCL: Chemokine (C-X-C motif) ligand; CCL: C-C motif chemokine ligand; SVR: Support vector regression; MIG: Monokine induced by interferon γ; SCF: Stem cell factor; TRAIL: Tumor necrosis factor-related apoptosis-inducing ligand; PLT: Platelet; GGT: Gamma-glutamyl transpeptidase; HDL-c: High density lipoprotein cholesterol; FIB-4: Fibrosis-4; HBV: Hepatitis B virus; TACE: Transarterial chemoembolization; BCLC: Barcelona Clinic Liver Cancer.
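
The Performance column of Table 1 mixes several threshold-based metrics (sensitivity, specificity, accuracy, precision or positive predictive value, negative predictive value, F-measure). As a reading aid, the sketch below shows how these quantities follow from the four cells of a binary confusion matrix. The counts are hypothetical and are not taken from any study in the table; the names `ConfusionCounts` and `classification_metrics` are illustrative assumptions, not part of the reviewed work.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # diseased subjects correctly flagged
    fp: int  # healthy subjects wrongly flagged
    tn: int  # healthy subjects correctly cleared
    fn: int  # diseased subjects missed


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a metric is undefined.
    return numerator / denominator if denominator else float("nan")


def classification_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Standard definitions of the threshold-based metrics listed in Table 1."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # recall, true positive rate
        "specificity": _ratio(c.tn, c.tn + c.fp),  # true negative rate
        "accuracy": _ratio(c.tp + c.tn, total),
        "ppv": _ratio(c.tp, c.tp + c.fp),          # precision
        "npv": _ratio(c.tn, c.tn + c.fn),
        "f_measure": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),  # F1 score
    }


# Hypothetical counts chosen only for illustration.
example = ConfusionCounts(tp=90, fp=20, tn=80, fn=10)
metrics = classification_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike these metrics, AUC and AUROC summarize performance across all decision thresholds, so they cannot be recovered from a single confusion matrix.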

Table 2 Summary of the most repeated inputs of the machine learning models with the most repeated predictor outcomes for the four main inflammatory-related liver conditions

Inflammatory-related liver condition	Inputs	Most repeated predictors
FLD	Age, sex, blood biomarkers, and demographic, anthropometric, and clinical data	BMI, uric acid, TG, and ALT levels
Liver fibrosis	Age, sex, and CT images	Better diagnosis compared to classical methods like APRI and FIB-4 indexes
Virus-induced hepatitis	Age, sex, blood biomarkers, and demographic, anthropometric, and clinical data	AST, PLT levels, APRI index, and age
COVID-19	Age, sex, blood biomarkers, CT images, and demographic, anthropometric, and clinical data	Age, BMI, CT images, oxygen rate, AST, and ALT levels

FLD: Fatty liver disease; CT: Computed tomography; BMI: Body mass index; TG: Triglycerides; ALT: Alanine transaminase; AST: Aspartate transaminase; PLT: Platelet; APRI: Aspartate transaminase/platelet ratio index; COVID-19: Coronavirus disease 2019; FIB-4: Fibrosis-4.
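
One way to see how a "most repeated predictors" entry can arise is to tally the top predictors reported in the Observations/remarks column of Table 1. The sketch below does this for the four fatty liver disease studies only. It is an illustrative tally and not necessarily the procedure the authors used to build Table 2; the function name `repeated_predictors` and the two-study threshold are assumptions.

```python
from collections import Counter
from typing import Dict, List

# Top predictors as listed in Table 1 for the fatty liver disease studies.
# Fialoke et al report the average and maximum ALT values, recorded here as "ALT".
fld_top_predictors: Dict[str, List[str]] = {
    "Fialoke et al[63]": ["ALT"],
    "Ma et al[64]": ["BMI", "TG", "GGT", "ALT", "uric acid"],
    "Yip et al[65]": ["ALT", "HDL-c", "TG", "HbA1c", "white blood cells"],
    "Pei et al[66]": ["uric acid", "BMI", "TG"],
}


def repeated_predictors(studies: Dict[str, List[str]],
                        min_studies: int = 2) -> Dict[str, int]:
    """Count in how many studies each predictor appears and keep those
    reported by at least `min_studies` studies (assumed threshold)."""
    counts = Counter(
        predictor
        for predictors in studies.values()
        for predictor in set(predictors)  # count each study at most once
    )
    return {p: n for p, n in counts.items() if n >= min_studies}


repeated = repeated_predictors(fld_top_predictors)
for predictor, n in sorted(repeated.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{predictor}: reported by {n} studies")
```

With these inputs the tally keeps ALT, TG, BMI, and uric acid, which agrees with the FLD row of Table 2.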

Citation: Martínez JA, Alonso-Bernáldez M, Martínez-Urbistondo D, Vargas-Nuñez JA, Ramírez de Molina A, Dávalos A, Ramos-Lopez O. Machine learning insights concerning inflammatory and liver-related risk comorbidities in non-communicable and viral diseases. World J Gastroenterol 2022; 28(44): 6230-6248
URL: https://www.wjgnet.com/1007-9327/full/v28/i44/6230.htm
DOI: https://dx.doi.org/10.3748/wjg.v28.i44.6230