Yao MJ, Xing YF, Liu SH, Peng YF, Yang SH, Chen JJ, Zhao JM, Wang H. Construction of a community-based primary screening and hospital-based confirmatory screening pathway in pediatric nonalcoholic fatty liver disease. World J Gastroenterol 2025; 31(28): 108321 [PMID: 40741471 DOI: 10.3748/wjg.v31.i28.108321]
Corresponding Author of This Article
Hui Wang, PhD, Assistant Professor, Department of Maternal and Child Health, Peking University Health Science Center, No. 38 Xueyuan Road, Haidian District, Beijing 100191, China. huiwang@bjmu.edu.cn
Research Domain of This Article
Medical Laboratory Technology
Article-Type of This Article
Observational Study
Open-Access Policy of This Article
This article is an open-access article which was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Ming-Jie Yao, Department of Anatomy and Embryology, School of Basic Medical Sciences, Peking University Health Science Center, Beijing 100191, China
Yun-Fei Xing, Shu-Han Yang, Hui Wang, Department of Maternal and Child Health, School of Public Health, Peking University Health Science Center, Beijing 100191, China
Shu-Hong Liu, Department of Pathology and Hepatology, The Fifth Medical Center of PLA General Hospital, Beijing 100191, China
Ya-Fei Peng, Department of Nursing, Nursing School of Dalian Medical University, Dalian 116044, Liaoning Province, China
Juan-Juan Chen, Department of Pharmacy, The First Affiliated Hospital of Zhengzhou University, Zhengzhou 450001, Henan Province, China
Jing-Min Zhao, Department of Pathology, The Fifth Medical Center of Chinese PLA General Hospital, Beijing 100039, China
Co-corresponding authors: Jing-Min Zhao and Hui Wang.
Author contributions: Yao MJ, Wang H, and Zhao JM conceptualized and designed the research; Xing YF, Liu SH, Peng YF, Yang SH, and Chen JJ screened patients and acquired clinical data; Xing YF performed data analysis; Liu SH, Peng YF, Chen JJ, and Yang SH provided statistical guidance and material support; Xing YF and Yao MJ wrote the paper; Wang H and Zhao JM supervised the study; all authors critically revised the manuscript for important intellectual content. Yao MJ and Xing YF were both responsible for data collection and writing, and made critical and indispensable contributions to the completion of the project, so they are eligible to be co-first authors of the manuscript. Wang H and Zhao JM, as co-corresponding authors, played an essential and indispensable role in study design, data interpretation, and overall supervision. Both also provided funding support for the project.
Supported by National Natural Science Foundation of China, No. 82272433; Fujian Provincial Key Laboratory of Hepatic Drug Research, No. 2022-YF-0050; the Major Science and Technology Projects for Health of Zhejiang Province, No. WKJ-ZJ-2216; and the Cyrus Tang Foundation for Young Scholar, No. 2022 (2022-B126).
Institutional review board statement: This study was approved by the Institutional Review Board of Peking University, with approval No. IRB00001052-19081, dated 2020-09-24.
Informed consent statement: This was a retrospective study and the ethics committee agreed to exempt written informed consent.
Conflict-of-interest statement: All the authors report no relevant conflicts of interest for this article.
STROBE statement: The authors have read the STROBE Statement—checklist of items, and the manuscript was prepared and revised according to the STROBE Statement—checklist of items.
Data sharing statement: Datasets related to the present study are available from the corresponding author upon reasonable request at huiwang@bjmu.edu.cn.
Received: April 18, 2025 Revised: May 12, 2025 Accepted: July 2, 2025 Published online: July 28, 2025 Processing time: 98 Days and 4.2 Hours
Abstract
BACKGROUND
Fibrosis is a critical event in the progression of pediatric nonalcoholic fatty liver disease (NAFLD).
AIM
To develop less invasive models based on machine learning (ML) to predict significant fibrosis in Chinese NAFLD children.
METHODS
In this cross-sectional study, 222 and 101 NAFLD children with available liver biopsy data were included in the development of screening models for tertiary hospitals and community health centers, respectively. Predictive factors were selected using least absolute shrinkage and selection operator regression and stepwise logistic regression analyses. Logistic regression (LR) and other ML models were applied to construct the prediction models.
RESULTS
Simplified indicators of the ATS and BIU indices were constructed for tertiary hospitals and community health centers, respectively. When models based on the ATS and BIU parameter combinations were constructed, the random forest (RF) model demonstrated higher screening accuracy than the LR model (0.80 and 0.79 for the RF model and 0.72 and 0.77 for the LR model, respectively). Using cutoff values of 90% for sensitivity and 90% for specificity, the RF models could effectively identify and exclude NAFLD children with significant fibrosis in the internal validation set (with positive predictive values and negative predictive values exceeding 0.80), which could avoid liver biopsy in 60% and 71.4% of NAFLD children, respectively.
CONCLUSION
This study developed new models for predicting significant fibrosis in NAFLD children in tertiary hospitals and community health centers, which can serve as preliminary screening tools to detect the risk population in a timely manner.
Core Tip: Fibrosis is a critical event in the progression of pediatric nonalcoholic fatty liver disease (NAFLD), and practical and efficient screening indices for early detection and referral in a large population are urgently needed. Different indices were generated for tertiary hospitals and community health centers based on the data of Chinese NAFLD children individually confirmed via liver biopsy. Serial tests were able to dramatically increase the positive predictive value. The sequential implementation of these less invasive screening predictors and referral systems could help physicians accurately detect the at-risk population; however, validation in populations with a broad range of ages and ethnicities is needed.
Nonalcoholic fatty liver disease (NAFLD) refers to the accumulation of triglycerides (TGs) in hepatocytes exceeding 5% of the liver weight in the absence of excessive alcohol consumption or other known causes of liver damage. It encompasses a spectrum of liver conditions ranging from simple steatosis to nonalcoholic steatohepatitis (with or without fibrosis) and may progress to cirrhosis and hepatocellular carcinoma. NAFLD is the most common chronic liver disease in children and affects 5%-10% of the general pediatric population[1], with a prevalence reaching 43% in overweight/obese children[2]. In 2021, a meta-analysis of 35300 children revealed that the prevalence of NAFLD among the general population of children in China was 6.3%, and this prevalence was observed to be as high as 40.4% among overweight/obese children[3]. Liver fibrosis is a crucial event in NAFLD progression, and the early identification and intervention of this type of fibrosis can help in controlling its progression[4,5].
The gold standard technique for diagnosing liver fibrosis is liver biopsy. However, its invasive nature, requirement for specialized operators, and potential for complications such as pain and bleeding limit its practical application, particularly in children[6]. To date, several less invasive indices for the diagnosis of liver fibrosis have been developed for children, such as the body mass index (BMI)-z score-aspartate aminotransferase (AST), BMI-z score-AST (B-AST)-platelets ratio index, BMI-z score-fibrosis index[7], and Pediatric NAFLD Fibrosis Score (PNFS)[8]. However, most of these indices were generated based on the Caucasian population, and few less-invasive predictors have been constructed based on Chinese children who are susceptible to abdominal obesity, the prevalence of which has more than doubled to 14.5% over the past two decades[9].
Machine learning (ML), which is a subset of artificial intelligence, enables computers to learn patterns from data and make predictions. Compared with traditional analytical methods, ML offers powerful nonlinear modeling capabilities, automatic feature extraction, and the ability to continuously achieve optimization via iterative learning. With advancements in data availability and algorithmic development, ML models have been increasingly applied to complex clinical tasks. Chang et al[10] used ML methods to generate prediction models for advanced fibrosis in adults, and these models achieved better performance in comparison with traditional models.
Recently, a meta-analysis revealed that the NAFLD incidence rate was highest in mainland China[11] and NASH-related liver cancer has exhibited the fastest increase in prevalence in comparison with hepatitis B virus/hepatitis C virus/alcohol-related liver cancer[12]. Concurrently, the prevalence of overweight/obesity in Chinese children was observed to be 23.4% in 2019 and is expected to increase to 32.7% by 2030; additionally, the prevalence of overweight/obesity in rural areas is expected to surpass that in urban areas by 2025[13]. Consequently, the prevalence of obesity-related NAFLD will exhibit similar trends, which suggests that noncommunicable diseases play a dominant role in the overall disease burden among adolescents in China[14]. Due to the large population and unbalanced development of the health care system, appropriate prediction indicators that are suitable for different resource settings and cost effective should be developed and implemented.
The present study applied ML methods to construct appropriate indicators for different resource settings based on liver biopsy data from Chinese children; these indicators may help to reduce the obesity-related disease burden on children and related individuals (such as parents and physicians) and promote the concept of “healthy weight”.
MATERIALS AND METHODS
Study design and population
The study population included 268 NAFLD children aged 2-18 years who were diagnosed via liver biopsy at the Fifth Medical Center of the Chinese PLA General Hospital (PLA 5th) from 2011 to 2018. Participants lacking key anthropometric and laboratory data were excluded from the study. Information regarding missing indicators is shown in Supplementary Table 1. Because the PLA 5th dataset is sourced from a tertiary hospital, it reflects the diagnostic capabilities of such institutions. To develop pediatric significant fibrosis screening models suitable for tertiary hospitals, and with consideration of the commonly used examination indices in these settings[15,16], most of the examination variables in the dataset were initially included. After exclusion, 222 participants were included in the screening models of tertiary hospitals. In addition, early screening and diagnosis of diseases are often conducted at community health centers, where the commonly used examination indices are primarily derived from basic health monitoring programs and are relatively low in cost. Considering the differences in the commonly used examination indices between the two types of institutions, as well as the indices that are commonly used at community health centers[17-19], 18 variables were initially included in this model. After exclusion, 101 participants were included in the screening models of community health centers. Sensitivity analysis revealed no significant difference in the characteristics of the variables after the exclusion of missing values (Supplementary Table 2). This study was approved by the Ethics Committee of Peking University (No. IRB00001052-19081). Figure 1 illustrates the detailed workflow of the study. All of the research was conducted in accordance with the Declarations of Helsinki and Istanbul. Written consent was provided by all of the subjects.
Figure 1 Flowchart for the selection of study subjects.
The development process of the model is shown above. Regarding the left part of the figure, based on the data of 222 children with nonalcoholic fatty liver disease (NAFLD), a tertiary hospital screening model was established via variable screening and combination; regarding the right part of the figure, screening models for community health centers were developed based on the data of 101 NAFLD children using the same method. NAFLD: Nonalcoholic fatty liver disease; HDL: High-density lipoprotein cholesterol; BMI: Body mass index; UA: Uric acid; ML: Machine learning.
Anthropometric and blood measurements
Anthropometric measurements were performed by trained personnel following a standard protocol. After the participants removed their coats and shoes, height was measured to 0.1 cm using a mechanical stadiometer, and weight was measured to 0.01 kg using the InBody scale. BMI was calculated as weight/height squared (kg/m2). Blood samples were collected after the patients had fasted for at least 8 hours and were tested by the laboratory department of the PLA 5th Hospital. All of the detection procedures were performed in accordance with standard experimental procedures.
Definition of NAFLD and significant fibrosis
The scoring of liver fibrosis and necroinflammatory activity was based on histopathological examination of percutaneous liver biopsy samples. Liver fibrosis in NAFLD patients was staged according to the Brunt staging system[20] (supplementary methods in Supplementary material). In this study, significant fibrosis was defined as a score of ≥ F2.
Statistical analysis
Continuous variables are presented as the means ± SD, and categorical variables are presented as numbers (percentages). Student’s t test or the Kruskal-Wallis test was used to detect significant differences between groups for continuous variables, and the χ2 test was used for the categorical variables. Predictive factors were selected using least absolute shrinkage and selection operator (LASSO) regression and stepwise logistic regression (LR) analyses. ML methods were applied to construct the prediction models (Figure 2, supplementary methods in Supplementary material).
Figure 2 Graphical abstract.
ATS was a logistic regression model developed in this study that included six parameters: Alkaline phosphatase, total bile acid, aspartate aminotransferase, cholinesterase, high-density lipoprotein cholesterol, and fibrinogen. ML: Machine learning.
The bootstrap method was used to compare the area under the curve (AUC) values of the models, and the DeLong test was used for pairwise comparisons of the AUC values. All of the analyses were performed in R 4.3.1, and significance was set at a two-sided P value < 0.05.
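To make the AUC comparison concrete, the following is a minimal sketch of a paired bootstrap for the difference between two AUC values. It is written in Python with the standard library only and is not the authors' R code; the function names, the example scores and the number of resamples are assumptions chosen for illustration, and the DeLong test itself is not implemented here.

```python
import random


def auc(scores, labels):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_difference(scores_a, scores_b, labels, n_boot=1000, seed=0):
    """Percentile 95% interval for AUC(a) - AUC(b), resampling patients in pairs."""
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # resample contains a single class; draw again
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        diffs.append(auc(a, y) - auc(b, y))
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]


# Hypothetical scores from two models on the same eight patients (1 = >= F2).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
model_a = [0.9, 0.8, 0.7, 0.4, 0.5, 0.3, 0.2, 0.1]
model_b = [0.6, 0.4, 0.7, 0.2, 0.5, 0.3, 0.8, 0.1]

print(auc(model_a, labels), auc(model_b, labels))
print(bootstrap_auc_difference(model_a, model_b, labels))
```

An interval that excludes zero would suggest a difference in discrimination between the two models; with samples as small as this toy example, the interval is expected to be wide.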
RESULTS
The characteristics of the participants included in this study are shown in Table 1. According to the PLA 5th dataset, the mean age of the children was 11.62 ± 3.23 years, the mean BMI-z score was 2.25 ± 1.13, and 87.8% of the participants were male. There were 113 children diagnosed with significant fibrosis. Compared with those with F0-F1, NAFLD children with ≥ F2 exhibited higher BMI-z scores, as well as higher levels of inflammation (G score), serum liver enzymes [alanine aminotransferase, AST, alkaline phosphatase (ALP), and glutamyl transpeptidase], insulin, and uric acid (UA) (all P < 0.05).
Table 1 Baseline characteristics of the study population, n (%).
Thirteen variables were selected via the intersection of stepwise LR and LASSO regression methods (Supplementary Table 3). BMI exhibited strong multicollinearity with other variables (variance inflation factor = 6.68); thus, this variable was removed from the model. The multicollinearity among the remaining 12 variables is shown in Supplementary Table 4. The index combining these 12 variables was termed the ATC index (text 3 in Supplementary material). The AUC value of the ATC index in both the training set and the internal validation set was 0.80 (Supplementary Table 5), and the ATC index generally exhibited a greater AUC, accuracy and PPV than previous indices (P < 0.05) in the current population. The screening performance based on the variable combination of the ATC index was better for the RF and XGBoost models than for the LR models (Supplementary Table 6). Supplementary Figure 1 shows that the international normalized ratio and AST were the key influential factors for the RF model, whereas creatinine and AST were the key influential factors for the XGBoost model.
The ATC index was further simplified (Supplementary Table 7). The optimal model included the following six parameters: ALP, total bile acid (TBA), AST, cholinesterase, high-density lipoprotein cholesterol (HDL) and fibrinogen. The resulting LR model, termed the ATS index, was calculated as 0.959 + 0.006 × ALP (U/L) + 0.073 × TBA (μmol/L) + 0.007 × AST (U/L) - 0.001 × cholinesterase (U/L) - 2.699 × HDL (mmol/L) + 0.744 × fibrinogen (g/L). The cutoff value based on the maximum Youden index was 0.55. Moreover, the AUC value of the ATS index for predicting ≥ F2 in the validation set was 0.70, which was greater than that of previous indices (Supplementary Table 8). The prediction performance of the RF and XGBoost models was better than that of the LR model, particularly for the RF model, which achieved an AUC of 0.80 (Table 2).
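As an illustration only, the ATS formula reported above can be evaluated as follows. This Python sketch uses the published coefficients as printed (rounded to three decimals); the function names and the patient values are hypothetical. The text does not state whether the cutoff of 0.55 applies to the linear score or to a transformed probability, so the comparison against the linear score is an assumption, and the sketch is not intended for clinical use.

```python
# Coefficients as printed in the text (rounded to three decimals).
ATS_INTERCEPT = 0.959
ATS_COEFFICIENTS = {
    "alp": 0.006,              # alkaline phosphatase, U/L
    "tba": 0.073,              # total bile acid, umol/L
    "ast": 0.007,              # aspartate aminotransferase, U/L
    "cholinesterase": -0.001,  # U/L
    "hdl": -2.699,             # high-density lipoprotein cholesterol, mmol/L
    "fibrinogen": 0.744,       # g/L
}
ATS_CUTOFF = 0.55  # maximum Youden index cutoff reported in the text


def ats_index(values):
    """Linear ATS score: intercept plus the weighted sum of the six parameters."""
    return ATS_INTERCEPT + sum(
        coef * values[name] for name, coef in ATS_COEFFICIENTS.items()
    )


def ats_positive(values, cutoff=ATS_CUTOFF):
    """Assumption: a score at or above the cutoff is read as screen positive."""
    return ats_index(values) >= cutoff


# Hypothetical laboratory values, chosen only to exercise the formula.
patient = {
    "alp": 200.0,
    "tba": 5.0,
    "ast": 40.0,
    "cholinesterase": 8000.0,
    "hdl": 1.0,
    "fibrinogen": 2.5,
}
print(round(ats_index(patient), 3), ats_positive(patient))
```

Because the cholinesterase term is multiplied by values in the thousands, the score is sensitive to the rounding of that coefficient; anyone reproducing the index should obtain the unrounded coefficients and the intended scale of the cutoff from the authors or the Supplementary material.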
Table 2 Diagnostic performances of the machine learning models based on the ATS index for the diagnosis of ≥ F2 in the training and validation sets.
The yield of the corresponding cutoff value at 90% sensitivity or 90% specificity was greater for the ATS model than for the RF and XGBoost models (Supplementary Table 9). The proportion of patients with significant fibrosis who could not be screened via the ML models in the training dataset was 7.91%-22.03%, which was lower than that of the ATS (44.07%).
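The idea of paired cutoffs, one fixed at 90% sensitivity to rule out disease and one fixed at 90% specificity to rule in disease, can be sketched as below. This is a minimal Python illustration under stated assumptions: the function names and scores are hypothetical, cutoffs are searched only among observed score values, and the "indeterminate" share is computed over all patients, which is one plausible reading of the proportion described above rather than the authors' exact definition.

```python
def sensitivity(scores, labels, cutoff):
    """Share of true positives with a score at or above the cutoff."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    return sum(s >= cutoff for s in pos) / len(pos)


def specificity(scores, labels, cutoff):
    """Share of true negatives with a score below the cutoff."""
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return sum(s < cutoff for s in neg) / len(neg)


def rule_out_cutoff(scores, labels, target=0.90):
    """Highest observed score that still keeps sensitivity >= target."""
    for cutoff in sorted(set(scores), reverse=True):
        if sensitivity(scores, labels, cutoff) >= target:
            return cutoff
    return min(scores)  # not reached: the lowest score gives sensitivity 1.0


def rule_in_cutoff(scores, labels, target=0.90):
    """Lowest observed score that keeps specificity >= target (inf if none does)."""
    for cutoff in sorted(set(scores)):
        if specificity(scores, labels, cutoff) >= target:
            return cutoff
    return float("inf")


def indeterminate_fraction(scores, low, high):
    """Share of patients with low <= score < high: neither ruled out nor ruled in."""
    return sum(low <= s < high for s in scores) / len(scores)


# Hypothetical model outputs: ten children with >= F2 followed by ten with F0-F1.
labels = [1] * 10 + [0] * 10
scores = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65, 0.60, 0.45, 0.30,
          0.70, 0.55, 0.50, 0.40, 0.35, 0.25, 0.20, 0.15, 0.10, 0.05]

low = rule_out_cutoff(scores, labels)   # below this: fibrosis considered excluded
high = rule_in_cutoff(scores, labels)   # at or above this: fibrosis considered likely
print(low, high, indeterminate_fraction(scores, low, high))
```

Patients who fall between the two cutoffs are the ones who would still need further assessment, so a model with a narrower indeterminate band spares more children from biopsy.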
Construction of indices for community health centers
When only the routine examination indices of community health centers were included, ten variables were selected (Supplementary Table 10), with no strong multicollinearity being observed among them (Supplementary Table 11). The combined index was termed the HIU index (text 3 in Supplementary material). The AUC values of the HIU index were above 0.85 and were significantly greater than those of other indicators such as B-AST, FIB-4, and NFS (all P < 0.05; Supplementary Table 12). In addition, the AUC values of the SVM and RF models were greater than those of the LR models (Supplementary Table 13). Supplementary Figure 2 shows that insulin played an important role in the RF and XGBoost models according to the SHapley Additive exPlanations (SHAP) values.
The HIU index was subsequently simplified, and three variables were selected: BMI, insulin, and UA (Supplementary Table 14). The resulting LR model was termed the BIU index. The BIU index was calculated as -0.875 + 0.097 × BMI (kg/m2) + 0.063 × insulin (mU/L) - 0.008 × UA (μmol/L), and the cutoff value according to the maximum Youden index was -0.06. The AUC value of the BIU in the training set was 0.81, which was significantly greater than that of the previous indices (all P < 0.05; Supplementary Tables 12 and 15). Furthermore, the ML models that were developed based on the BIU index are shown in Table 3. Only the AUC value of the RF model was greater than that of the LR model. When the cutoff values corresponding to 90% specificity were used, the negative predictive values (NPVs) of the RF and XGBoost models in the internal validation set were 1.00 and 0.86, respectively, and the proportions of patients who could not be diagnosed were lower for these models than for the LR model (28.57% and 9.52% vs 47.62%; Supplementary Table 16).
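For illustration, the BIU formula and a Youden-index cutoff search can be sketched in Python as follows. The coefficients and the cutoff of -0.06 are taken from the text; the function names, the patient values and the small score list are hypothetical, and treating a score at or above the cutoff as screen positive is an assumption. The sketch is not intended for clinical use.

```python
BIU_CUTOFF = -0.06  # maximum Youden index cutoff reported in the text


def biu_index(bmi, insulin, uric_acid):
    """BIU score from BMI (kg/m2), insulin (mU/L) and uric acid (umol/L)."""
    return -0.875 + 0.097 * bmi + 0.063 * insulin - 0.008 * uric_acid


def biu_positive(bmi, insulin, uric_acid, cutoff=BIU_CUTOFF):
    """Assumption: a score at or above the cutoff is read as screen positive."""
    return biu_index(bmi, insulin, uric_acid) >= cutoff


def youden_cutoff(scores, labels):
    """Observed score that maximizes J = sensitivity + specificity - 1."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        sens = sum(s >= cutoff for s in positives) / len(positives)
        spec = sum(s < cutoff for s in negatives) / len(negatives)
        j = sens + spec - 1.0
        if j > best_j:  # the first (lowest) cutoff wins when several tie
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Two hypothetical children, chosen only to exercise the formula.
print(round(biu_index(28.0, 20.0, 400.0), 3), biu_positive(28.0, 20.0, 400.0))
print(round(biu_index(30.0, 30.0, 400.0), 3), biu_positive(30.0, 30.0, 400.0))

# Hypothetical scores with known fibrosis status (1 = >= F2).
print(youden_cutoff([-0.5, -0.2, 0.1, 0.4], [0, 0, 1, 1]))
```

The Youden index weights sensitivity and specificity equally, which is why the study also reports separate cutoffs fixed at 90% sensitivity and 90% specificity for rule-out and rule-in decisions.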
Table 3 Diagnostic performances of the machine learning models based on the BIU index for the diagnosis of ≥ F2 in the training and validation sets.
Using the cutoff values corresponding to 90% sensitivity, the combined LR indices achieved an NPV of 100%, thus allowing for the accurate exclusion of patients with significant fibrosis. When the cutoff values corresponding to 90% specificity were used, the combined LR indices demonstrated a PPV of 94%, thereby enabling the precise diagnosis of patients with significant fibrosis. Similar results were detected for the joint indices generated with the RF model or in combination with the LR model, in which the PPV further improved to a value of 0.95 or greater (Table 4).
Table 4 Diagnostic performance of serial tests of the BIU and ATS indices in predicting significant fibrosis.
| Cutoff (both BIU and ATS) | ATS model | BIU model | BIU+/ATS+ (n) | BIU+/ATS- (n) | BIU-/ATS+ (n) | BIU-/ATS- (n) | Sensitivity | Specificity | Accuracy | PPV | NPV |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 90% sensitivity | LR | LR | 70 | 11 | 19 | 2 | 1.00 | 0.04 | 0.58 | 0.57 | 1.00 |
| 90% specificity | LR | LR | 17 | 19 | 16 | 50 | 0.28 | 0.98 | 0.59 | 0.94 | 0.52 |
| 90% sensitivity | RF | LR | 42 | 39 | 4 | 17 | 0.95 | 0.31 | 0.67 | 0.64 | 0.82 |
| 90% specificity | RF | LR | 34 | 2 | 32 | 34 | 0.58 | 0.98 | 0.76 | 0.97 | 0.65 |
| 90% sensitivity | RF | RF | 41 | 11 | 5 | 45 | 0.95 | 0.93 | 0.94 | 0.95 | 0.93 |
| 90% specificity | RF | RF | 58 | 5 | 8 | 31 | 0.96 | 0.93 | 0.95 | 0.95 | 0.94 |
+: Screen positive; -: Screen negative; n: Number of children in each cross-classified cell.
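The performance measures reported in Table 4 can be derived from per-patient results once two screening tests are combined into a single call. The Python sketch below shows one common way to do this, under the assumption that a rule-out strategy calls a child negative only when both tests are negative and a rule-in strategy calls a child positive only when both tests are positive. The function names and the eight-patient dataset are hypothetical and do not reproduce the study data.

```python
def combine_rule_out(first_positive, second_positive):
    """High-sensitivity use: positive if either test is positive."""
    return first_positive or second_positive


def combine_rule_in(first_positive, second_positive):
    """High-specificity use: positive only if both tests are positive."""
    return first_positive and second_positive


def _ratio(numerator, denominator):
    """Plain division that returns NaN instead of failing on an empty group."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(predicted, truth):
    """Sensitivity, specificity, accuracy, PPV and NPV from paired boolean lists."""
    tp = sum(p and t for p, t in zip(predicted, truth))
    fp = sum(p and not t for p, t in zip(predicted, truth))
    fn = sum(not p and t for p, t in zip(predicted, truth))
    tn = sum(not p and not t for p, t in zip(predicted, truth))
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


# Hypothetical results for eight children (True = significant fibrosis / screen positive).
truth = [True, True, True, True, False, False, False, False]
biu = [True, True, False, True, True, False, False, False]
ats = [True, False, True, True, False, True, False, False]

rule_out = [combine_rule_out(b, a) for b, a in zip(biu, ats)]
rule_in = [combine_rule_in(b, a) for b, a in zip(biu, ats)]
print(diagnostic_metrics(rule_out, truth))
print(diagnostic_metrics(rule_in, truth))
```

In this toy example the rule-out combination reaches an NPV of 1.0 at the cost of specificity, whereas the rule-in combination reaches a PPV of 1.0 at the cost of sensitivity, which mirrors the trade-off between the 90% sensitivity and 90% specificity rows of the table.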
DISCUSSION
This study developed models for predicting significant fibrosis in Chinese NAFLD children based on different detection factors, including the ATS index for tertiary hospitals and the BIU index for community health centers. These models demonstrated good predictive performance in both the training and validation sets and outperformed most of the previously used indicators. Notably, when the BIU (specificity = 0.9) and ATS (specificity = 0.9) indices were sequentially used, the PPV reached 94%. In addition, in populations with similar characteristics, the RF models exhibited better classification performances than did the LR models.
Considering that obesity and metabolic dysfunction are key clinical features of NAFLD, an international expert panel has proposed the utilization of a new classification in recent years known as metabolic dysfunction-associated fatty liver disease (MAFLD)[21]. Unlike NAFLD, the diagnosis of MAFLD does not require the exclusion of individuals with alcohol consumption or other chronic liver diseases, as long as hepatic steatosis is present alongside metabolic abnormalities. However, MAFLD cannot fully replace the use of NAFLD as a classification, as a subset of lean individuals may still present with hepatic steatosis without obvious metabolic dysfunction[22]. Therefore, the present study adopted the NAFLD definition to ensure broader population coverage.
Previous studies have developed various indicators for diagnosing liver fibrosis in NAFLD children, such as B-AST, PNFS, and M-FIB4. Compared with these indicators, the prediction scores proposed in this study exhibited higher AUC, accuracy and NPV values. Additionally, studies focusing on NAFLD adult populations have developed various indices for predicting liver fibrosis. Among these indices, the NFS is recommended for use in identifying liver fibrosis according to the practice guidelines of the American Association for the Study of Liver Diseases[23]; moreover, the predictive performances of the FIB4 and the APRI indices have been widely validated for predicting different fibrosis stages in NAFLD adults[24]. In the present study, these indices showed only average performance in predicting significant fibrosis in our pediatric NAFLD patients, with AUC values not exceeding 0.7, which aligned with the findings of the study by He et al[25] involving 100 Chinese NAFLD children, thus implying potential discrepancies in characteristics between children and adults with liver fibrosis. Moreover, a small-sample study of 34 children and 23 adults revealed that children with chronic liver disease demonstrated higher liver enzyme levels and more pronounced inflammatory features compared to adults[26].
When the indices obtained from the PLA 5th for variable selection were used to predict significant fibrosis in NAFLD children, TBA, ALP, and AST were observed to play dominant roles. The serum TBA concentration is a crucial metabolic indicator for metabolic syndrome and liver diseases; additionally, compared with other BA parameters, the serum TBA concentration is most strongly correlated with significant fibrosis[27]. In the present study, the TBA levels in NAFLD children with ≥ F2 were significantly greater compared to the levels in children with F0-F1, which is consistent with the results of a previous case-control study[28]. A possible explanation for this result is that the elevation in the TBA level is a manifestation of the body's self-protective mechanism. An animal experiment revealed that BA can activate the expression of TGR5 in Kupffer cells, thereby reducing liver damage by inhibiting the excessive production of cytokines[29]. In addition, liver fibrosis may lead to changes in liver structure and vascular remodeling, thus triggering a decrease in liver clearance and portosystemic shunting, which subsequently reduces the liver uptake of BAs[28,30]. Multiple studies have demonstrated that ALP and AST are important predictive indices of liver fibrosis in NAFLD patients and that elevated ALP and AST levels are associated with an increased risk of liver fibrosis in NAFLD patients[31-34].
When considering routine physical examinations of community health centers, insulin and BMI played crucial roles, and insulin exhibited the highest SHAP values in the RF and XGBoost models. Moreover, there were strong associations observed between insulin resistance, obesity and liver fibrosis. Obesity and hyperinsulinemia can directly activate hepatic stellate cells or indirectly activate these cells via hepatic lipid accumulation, thereby promoting the occurrence of liver fibrosis[35,36]. In addition, there is a synergistic effect between obesity and insulin resistance, wherein obesity may activate proinflammatory M1 macrophages in adipose tissue and the release of proinflammatory cytokines, thus leading to hepatic insulin resistance and affecting the progression of liver fibrosis through insulin resistance[37,38].
Although numerous indices for predicting liver fibrosis in NAFLD patients (such as the PNFS and NFS indices) have been developed, each of these indices exhibits certain limitations, including low accuracy and high detection costs[24]. Therefore, the construction of predictive models with high accuracy based on simple parameters has been an ongoing effort, and ML models, which leverage powerful learning capabilities and sensitivity to complex relationships, offer a viable approach for achieving this goal. Several studies have applied ML methods to predict liver fibrosis in NAFLD patients. For example, Suárez et al[39] utilized support vector machines, decision trees, and XGBoost models to predict liver fibrosis in adult patients with nonalcoholic steatohepatitis. These models demonstrated excellent predictive performances, with the AUC values for all of these models exceeding 0.85; in particular, the XGBoost model demonstrated an AUC value of 0.95. Based on the same parameter set, Chang et al[10] reported that other ML models performed better than did LR models in identifying liver fibrosis in adult NAFLD patients. This study is the first to compare the performance of the LR model and other ML models in identifying significant fibrosis in NAFLD children, and the results were consistent with those of the abovementioned studies. When a larger number of indicators were incorporated, the ML models demonstrated better predictive performances. Specifically, the RF models demonstrated AUC values that were superior to those of the LR model in both the training set and the internal validation set. Moreover, even after simplifying the indices, the other ML models exhibited better classification performances compared to the LR models in the internal validation set. Notably, when the screening models were constructed based on the BIU parameter combination, the RF model achieved an AUC value of 1 in predicting significant fibrosis in the training set.
This effect may be related to the principles of the RF method, as it is an ensemble method based on decision trees that can effectively capture complex patterns and nonlinear relationships in the data. Moreover, the RF model achieved an AUC value of 0.81 in the validation set for the screening of significant fibrosis, thus indicating that the model has a certain degree of generalizability.
When considering the testing capabilities and accessibility of indicators in tertiary hospitals and community health centers, this study developed models for predicting significant fibrosis in NAFLD children at different institutions and confirmed that the RF model outperforms the LR model in classifying significant fibrosis in NAFLD children. The use of the term "NAFLD" may be controversial in English contexts because of its potential for discrimination and stigma. In recent years, it has been recommended that these terms be replaced by new terms such as "MAFLD" or "MASLD"[21,40]. However, in the Chinese context, stigmatization is almost nonexistent[41]. Additionally, given that a consensus on the various alternative terms for NAFLD has not yet been reached, screening models based on the NAFLD definition may be more appropriate and effective in practical applications. This study explored the performance of serial tests of less invasive models for pediatric significant fibrosis, which provides a pathway for referral systems.
This study has several limitations. First, the sample size was limited, with 222 and 101 children being included in the development of screening models for tertiary hospitals and community health centers, respectively. Although this sample size meets the requirements for model development and comparison of predictive performance, a larger sample size would help to reduce overfitting and improve the model's generalization ability. Second, this study was based on a single-center retrospective design, which may introduce selection bias and limit the representativeness of the findings. Third, this study was conducted solely on a Chinese population and primarily included obese children. Finally, the screening models that were developed in this study have not been validated in external populations. Although the models demonstrated good discriminative ability in cross-validation, further prospective, multicenter studies are needed to assess their generalizability, applicability, and clinical utility in diverse pediatric populations.
CONCLUSION
The present study developed precise and less invasive indices for detecting significant fibrosis in children with NAFLD, including the ATS index for tertiary hospitals and the BIU index for community health centers. Sequential use of the BIU and ATS indices at a specificity of 90% increased the PPV to > 90%, which could help clinicians to confidently determine the treatment of patients with NAFLD.
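The effect of serial testing on the PPV can be illustrated with a short calculation. The sketch below applies Bayes' rule twice: the PPV of the first test becomes the pre-test probability for the second test, which is applied only to children who screened positive. The prevalence and sensitivity values are hypothetical placeholders chosen for illustration and are not results from this study; only the 90% specificity is taken from the text. The calculation also assumes that the two indices are conditionally independent given disease status, which may not hold in practice if they share indicators.

```python
def ppv(prevalence, sensitivity, specificity):
    """Positive predictive value of a single test, from Bayes' rule."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


def serial_ppv(prevalence, tests):
    """PPV after a chain of tests, each applied only to those who were
    positive on the previous test.

    `tests` is a list of (sensitivity, specificity) pairs.
    Assumes the tests are conditionally independent given disease status.
    """
    probability = prevalence
    for sensitivity, specificity in tests:
        # The post-test probability becomes the next pre-test probability.
        probability = ppv(probability, sensitivity, specificity)
    return probability


# Hypothetical inputs (NOT taken from the study).
prevalence = 0.30      # assumed proportion with significant fibrosis
biu = (0.50, 0.90)     # assumed sensitivity, specificity fixed at 90%
ats = (0.60, 0.90)     # assumed sensitivity, specificity fixed at 90%

after_biu = serial_ppv(prevalence, [biu])
after_both = serial_ppv(prevalence, [biu, ats])

print(f"PPV after first test alone: {after_biu:.2f}")   # 0.68
print(f"PPV after both tests:       {after_both:.2f}")  # 0.93
```

Under these assumed inputs, a single test at 90% specificity leaves roughly one in three positive results false, whereas confirming positives with a second test raises the PPV above 90%. The cost of serial testing is lower overall sensitivity, since a child must test positive on both indices to be classified as positive.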
ACKNOWLEDGEMENTS
We sincerely thank all individuals who participated in this study.
Footnotes
Provenance and peer review: Unsolicited article; Externally peer reviewed.
Peer-review model: Single blind
Specialty type: Gastroenterology and hepatology
Country of origin: China
Peer-review report’s classification
Scientific Quality: Grade B, Grade B
Novelty: Grade A, Grade B
Creativity or Innovation: Grade B, Grade B
Scientific Significance: Grade B, Grade B
P-Reviewer: Chen JY, Dang SS; S-Editor: Li L; L-Editor: A; P-Editor: Wang WB
Wang Y, Yang ZR, Chen RH. [Meta-analysis of the prevalence of non-alcoholic fatty liver disease in Chinese children]. Zhongguo Ertong Baojian Zazhi. 2022;30:764-769.
Neuberger J, Patel J, Caldwell H, Davies S, Hebditch V, Hollywood C, Hubscher S, Karkhanis S, Lester W, Roslund N, West R, Wyatt JI, Heydtmann M. Guidelines on the use of liver biopsy in clinical practice from the British Society of Gastroenterology, the Royal College of Radiologists and the Royal College of Pathology. Gut. 2020;69:1382-1403.
Chang D, Truong E, Mena EA, Pacheco F, Wong M, Guindi M, Todo TT, Noureddin N, Ayoub W, Yang JD, Kim IK, Kohli A, Alkhouri N, Harrison S, Noureddin M. Machine learning models are superior to noninvasive tests in identifying clinically significant stages of NAFLD and NAFLD-related cirrhosis. Hepatology. 2023;77:546-557.
Wang H, Song Y, Ma J, Ma S, Shen L, Huang Y, Thangaraju P, Basharat Z, Hu Y, Lin Y, Peden AE, Sawyer SM, Zhang H, Zou Z. Burden of non-communicable diseases among adolescents and young adults aged 10-24 years in the South-East Asia and Western Pacific regions, 1990-2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet Child Adolesc Health. 2023;7:621-635.
Ampuero J, Pais R, Aller R, Gallego-Durán R, Crespo J, García-Monzón C, Boursier J, Vilar E, Petta S, Zheng MH, Escudero D, Calleja JL, Aspichueta P, Diago M, Rosales JM, Caballería J, Gómez-Camarero J, Lo Iacono O, Benlloch S, Albillos A, Turnes J, Banales JM, Ratziu V, Romero-Gómez M; HEPAmet Registry. Development and Validation of Hepamet Fibrosis Scoring System-A Simple, Noninvasive Test to Identify Patients With Nonalcoholic Fatty Liver Disease With Advanced Fibrosis. Clin Gastroenterol Hepatol. 2020;18:216-225.e5.
Chalasani N, Younossi Z, Lavine JE, Diehl AM, Brunt EM, Cusi K, Charlton M, Sanyal AJ; American Gastroenterological Association; American Association for the Study of Liver Diseases; American College of Gastroenterology. The diagnosis and management of non-alcoholic fatty liver disease: practice guideline by the American Gastroenterological Association, American Association for the Study of Liver Diseases, and American College of Gastroenterology. Gastroenterology. 2012;142:1592-1609.
Chalasani N, Younossi Z, Lavine JE, Diehl AM, Brunt EM, Cusi K, Charlton M, Sanyal AJ; American Association for the Study of Liver Diseases; American College of Gastroenterology; American Gastroenterological Association. The diagnosis and management of non-alcoholic fatty liver disease: Practice guideline by the American Association for the Study of Liver Diseases, American College of Gastroenterology, and the American Gastroenterological Association. Am J Gastroenterol. 2012;107:811-826.
Charoenchue P, Khorana J, Tantraworasin A, Pojchamarnwiputh S, Na Chiangmai W, Amantakul A, Chitapanarux T, Inmutto N. Simple Clinical Prediction Rules for Identifying Significant Liver Fibrosis: Evaluation of Established Scores and Development of the Aspartate Aminotransferase-Thrombocytopenia-Albumin (ATA) Score. Diagnostics (Basel). 2025;15:1119.
Zhan J, Wang J, Zhang Z, Xue R, Jiang S, Liu J, Liu Y, Zhu L, Xia J, Yan X, Ding W, Zhu C, Qiu Y, Li J, Huang R, Wu C. Noninvasive diagnosis of significant liver inflammation in patients with chronic hepatitis B in the indeterminate phase. Virulence. 2023;14:2268497.
Rinella ME, Lazarus JV, Ratziu V, Francque SM, Sanyal AJ, Kanwal F, Romero D, Abdelmalek MF, Anstee QM, Arab JP, Arrese M, Bataller R, Beuers U, Boursier J, Bugianesi E, Byrne CD, Castro Narro GE, Chowdhury A, Cortez-Pinto H, Cryer DR, Cusi K, El-Kassas M, Klein S, Eskridge W, Fan J, Gawrieh S, Guy CD, Harrison SA, Kim SU, Koot BG, Korenjak M, Kowdley KV, Lacaille F, Loomba R, Mitchell-Thain R, Morgan TR, Powell EE, Roden M, Romero-Gómez M, Silva M, Singh SP, Sookoian SC, Spearman CW, Tiniakos D, Valenti L, Vos MB, Wong VW, Xanthakos S, Yilmaz Y, Younossi Z, Hobbs A, Villota-Rivas M, Newsome PN; NAFLD Nomenclature consensus group. A multisociety Delphi consensus statement on new fatty liver disease nomenclature. J Hepatol. 2023;79:1542-1556.