Minireviews Open Access
Copyright ©The Author(s) 2024. Published by Baishideng Publishing Group Inc. All rights reserved.
World J Nephrol. Sep 25, 2024; 13(3): 97214
Published online Sep 25, 2024. doi: 10.5527/wjn.v13.i3.97214
Challenges in predictive modelling of chronic kidney disease: A narrative review
Sukhanshi Khandpur, Swasti Tiwari, Department of Molecular Medicine & Biotechnology, Sanjay Gandhi Post Graduate Institute of Medical Sciences, Lucknow 226014, Uttar Pradesh, India
Prabhaker Mishra, Department of Biostatistics and Health Informatics, Sanjay Gandhi Post Graduate Institute of Medical Sciences, Lucknow 226014, Uttar Pradesh, India
Shambhavi Mishra, Department of Statistics, University of Lucknow, Lucknow 226007, Uttar Pradesh, India
ORCID number: Sukhanshi Khandpur (0000-0002-5364-5975); Swasti Tiwari (0000-0002-1701-2636).
Author contributions: Tiwari S and Khandpur S conceptualized the review; Khandpur S designed the manuscript and wrote the initial draft; Tiwari S revised the manuscript; Mishra P and Mishra S supervised the development of the flowchart and table and gave inputs to improve the manuscript; All authors approved the manuscript.
Supported by Coord/7 (1)/CAREKD/2018/NCD-II, No. 5/4/7-12/13/NCD-II; and Senior Research Fellowship by the Indian Council of Medical Research, New Delhi, No. 3/1/2(6)/Nephro/2022-NCD-II.
Conflict-of-interest statement: All authors have no conflicts of interest to disclose.
Open-Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/Licenses/by-nc/4.0/
Corresponding author: Swasti Tiwari, FRCP, PhD, Professor, Department of Molecular Medicine & Biotechnology, Sanjay Gandhi Post Graduate Institute of Medical Sciences, PMSSY Building, 4th Floor, Raebareli Road, Lucknow 226014, Uttar Pradesh, India. tiwaris@sgpgi.ac.in
Received: May 26, 2024
Revised: August 27, 2024
Accepted: August 29, 2024
Published online: September 25, 2024
Processing time: 115 Days and 22.4 Hours

Abstract

The exponential rise in the burden of chronic kidney disease (CKD) worldwide has put enormous pressure on the economy. Predictive modeling of CKD can ease this burden by predicting the future disease occurrence ahead of its onset. There are various regression methods for predictive modeling based on the distribution of the outcome variable. However, the accuracy of the predictive model depends on how well the model is developed by taking into account the goodness of fit, choice of covariates, handling of covariates measured on a continuous scale, handling of categorical covariates, and number of outcome events per predictor parameter or sample size. Optimal performance of a predictive model on an independent cohort is desired. However, there are several challenges in the predictive modeling of CKD. Disease-specific methodological challenges hinder the development of a predictive model that is cost-effective and universally applicable to predict CKD onset. In this review, we discuss the advantages and challenges of various regression models available for predictive modeling and highlight those best for future CKD prediction.

Key Words: Chronic kidney disease; Predictive modelling; Regression; Statistical modelling; Methodology

Core Tip: The burden of chronic kidney disease (CKD) is growing rapidly, and there is an urgent need to curb it by identifying individuals at high risk of developing CKD. A broad spectrum of statistical models exists that can predict the future onset of the disease. This narrative review discusses the practical applicability of various statistical models for CKD prediction.



INTRODUCTION

The growing burden of chronic diseases calls for advanced preventive measures, proper screening, and early diagnosis to limit the economic burden. Preventive strategies through changes in lifestyle and dietary habits could limit the burden of chronic diseases. However, these changes are difficult to inculcate, and reaching the sustainable development goal targets for reducing premature mortality is a long-term process. Statistical methods could be effectively applied to predict the onset of these chronic conditions through well-developed and validated predictive models. Different predictive models have been developed for different chronic diseases[1,2]. However, applying existing models in real life with adequate predictive accuracy and translational significance remains a major challenge for practitioners. Regional and sociodemographic differences between individuals limit the generalizability of the existing models. In addition, the choice of appropriate modeling techniques, including model development and validation methods, is among the other challenges for the practical application of the existing models.

Chronic kidney disease (CKD), one of the broad spectrum of chronic diseases, is on an exponential rise. Kidney Disease Improving Global Outcomes (KDIGO) guidelines define CKD as structural or functional abnormalities in the kidneys, present for > 3 mo[3]. Functional abnormalities in the kidneys can be assessed using the glomerular filtration rate (GFR), which measures the rate of filtration of blood through the glomeruli (networks of blood vessels in the kidneys). GFR is measured by the clearance of exogenous filtration markers[4]. In clinical practice, however, functional abnormalities are approximated using the estimated glomerular filtration rate (eGFR)[5]. eGFR is calculated using serum creatinine or serum cystatin C (endogenous markers) and classifies kidney function into G1–G5 categories, whereas the KDIGO classification based on urine albumin-to-creatinine ratio (ACR) classifies the disease into A1–A3 categories. However, early diagnosis of CKD between stages 1 and 3 is challenging, as CKD remains asymptomatic in its early stages. Noninvasive markers show up only when the majority of kidney tissue is already damaged. Thus, predictive modeling can address the issue and help ease the future CKD burden by predicting disease onset.

Several regression methods exist for the predictive modeling of the disease. The choice of method depends on the distribution of the outcome variable and its relationship with the covariates. Each of the available regression methods is defined under a set of assumptions specific to that method, and the extent to which real data deviate from these assumptions poses a real challenge for statisticians. It has been suggested that internal validation, calibration and discrimination of the model be adequately considered when developing the predictive model of a disease[6,7]. The broad classification of regression methods, based on the distribution of the outcome variable, includes multiple linear regression, quantile regression, logistic regression, Poisson regression, and negative binomial regression (Figures 1 and 2). This review discusses the challenges associated with the application of these regression methods for the predictive modeling of CKD.

Figure 1 Selection of appropriate regression model.
Figure 2 Selection of appropriate regression model for chronic kidney disease. ACR: Albumin-to-creatinine ratio; KDIGO: Kidney Disease Improving Global Outcomes; eGFR: Estimated glomerular filtration rate.
REGRESSION MODELS BASED ON CONTINUOUS OUTCOME VARIABLES
Simple/multiple linear regression model for CKD

Simple linear regression is the most basic regression method, initially conceptualized and applied by Sir Francis Galton to solve the problem of heredity in the 19th century. The simple linear regression model is given by $E(Y \mid X) = \mu(X) = \beta_0 + \beta_1 X$, which is a line with intercept $\beta_0$ and slope $\beta_1$, where $Y$ is the outcome variable measured on a continuous scale and $X$ is the covariate.

Simple linear regression can be extended to multiple linear regression, which includes more than one independent variable, to model multifactorial diseases like CKD. Multiple linear regression uses ordinary least squares estimation to study the association between the outcome variable and the covariates[8]. It relies on the basic assumption of a linear relationship between the predictor variables and the outcome, with the outcome measured on a continuous scale. However, because KDIGO classifies kidney disease using eGFR and ACR categories, linear regression is not directly suited to predicting future CKD. It can only be used to model changes in eGFR or ACR, which are continuous variables and surrogate endpoints for CKD[9]. Even then, longitudinal cohort studies with long follow-up periods are required to achieve the minimum sample size for a clinically significant decline in eGFR[10]. In addition to linearity, the model must satisfy the assumptions of homoscedasticity (constant variance of the errors), absence of multicollinearity (no strong correlation between the covariates), and independence of observations[8]. Moreover, the decline in kidney function is multifactorial and its distribution is likely to be skewed[11]. For example, Zhang et al[12] reported the serum stem cell factor level as a predictor of decline in kidney function using multiple linear regression. They used a single-time assessment of eGFR to assess kidney health, rather than the persistence over > 3 mo that KDIGO guidelines require. Similarly, Cheung et al[13] identified risk factors of incident CKD by eGFR change, contrary to the KDIGO recommendation. Another study[14] applied multiple linear regression to predict urine ACR in diabetes, but could not quantify the risk of kidney disease (categorized as persistent ACR ≥ 30 mg/g) in individuals with diabetes. These studies indicate the limitations of using multiple linear regression to predict CKD. One alternative is to relax the stringent assumptions of linear regression; quantile regression could serve this purpose for CKD prediction, as discussed in the following section.
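
As a minimal sketch of the linear model described in this section, the following Python snippet fits a simple linear regression by ordinary least squares using the closed-form estimates of the intercept and slope. The ages and follow-up eGFR values are the first six hypothetical rows of Table 1, and the function name `fit_simple_ols` is an illustrative assumption rather than part of any cited study.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (beta0, beta1) that minimize the sum of squared residuals."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta1 = sxy / sxx
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1


# Hypothetical data from Table 1 (IDs 1-6): age in years and
# follow-up eGFR in mL/min/1.73 m²
age = [58, 48, 37, 58, 53, 37]
egfr2 = [103.68, 88.90, 88.12, 87.06, 51.35, 108.90]

beta0, beta1 = fit_simple_ols(age, egfr2)

# Conditional mean E(Y | X = 50): a continuous eGFR value, not a CKD category
predicted_egfr_at_50 = beta0 + beta1 * 50

# Residuals of the fitted line, used to check the least squares conditions
residuals = [yi - (beta0 + beta1 * xi) for xi, yi in zip(age, egfr2)]
```

The prediction is a continuous eGFR value; it would still have to be categorized against the KDIGO thresholds to obtain a CKD status, which is the limitation discussed above.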

Quantile regression model for CKD

The concept of quantile regression was introduced by Koenker and Bassett in 1978. The quantile regression model for the $q$th quantile of the outcome variable $Y$ given the covariate $X$ is: $Q_{Y \mid X}(q) = f(\beta, X = x_i) = X\beta_q$, where $P(Y \le f(\beta, X = x_i)) = q$, $\beta_q$ is the regression coefficient for the $q$th quantile, and $0 \le q \le 1$.

Quantile regression models a quantile of the outcome variable and thus can handle the skewed distribution of kidney function decline, while the assumptions regarding the covariates remain the same[8]. Whereas ordinary least squares regression minimizes the sum of squared residuals, quantile regression minimizes a sum of asymmetrically weighted absolute residuals. Additionally, it is more robust and does not make any assumption about the distribution of the outcome variable, except the continuity of the variable, and can be used to model extreme values[15,16]. However, as discussed for linear regression, the issue of categorization of eGFR for KDIGO-based CKD classification cannot be neglected[12,14]. Quantile regression also requires a larger sample size than linear regression[8].
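
The weighting used by quantile regression can be illustrated with the check (pinball) loss. The sketch below, which uses hypothetical yearly eGFR declines and illustrative function names, searches for the constant that minimizes the total check loss. With no covariates, that constant is the sample quantile, which shows why the method is less sensitive to a skewed tail than the mean.

```python
def check_loss(residual, q):
    """Asymmetric absolute loss: weight q above the fit, (1 - q) below it."""
    return q * residual if residual >= 0 else (q - 1) * residual


def total_check_loss(y, prediction, q):
    """Sum of check losses of all observations around one predicted value."""
    return sum(check_loss(yi - prediction, q) for yi in y)


def best_constant(y, q):
    """Observed value that minimizes the total check loss for quantile q."""
    return min(sorted(y), key=lambda c: total_check_loss(y, c, q))


# Hypothetical, right-skewed yearly eGFR declines (mL/min/1.73 m² per year)
decline = [0.5, 0.8, 1.0, 1.1, 1.3, 1.5, 2.0, 2.4, 3.0, 6.5, 12.0]

median_fit = best_constant(decline, 0.5)   # 6th of 11 ordered values
upper_fit = best_constant(decline, 0.9)    # 10th of 11 ordered values
mean_decline = sum(decline) / len(decline)
```

In a full quantile regression the constant is replaced by the linear predictor $X\beta_q$ and the same loss is minimized over $\beta_q$.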

REGRESSION MODELS BASED ON CATEGORICAL OUTCOME VARIABLES FOR CKD
Poisson regression model for CKD

Poisson regression was named after the French mathematician and physicist Siméon Denis Poisson. The Poisson regression model is given by: $\log(\lambda_i) = \beta_0 + \beta_i X_i$, where the observed values $Y_i$ follow a Poisson distribution with mean $\lambda = \lambda_i$, the $X_i$ are covariates, and the $\beta_i$ are regression coefficients.

Poisson regression models a variable that follows the Poisson distribution, under the assumption that its mean and variance are equal[17]. It was initially developed to model discrete outcome variables (counts) but has also been widely accepted for modeling dichotomous variables (binary outcomes). Thus, Poisson regression could be an option to model the occurrence of CKD. However, because CKD has a low yearly incidence, the large majority of observations are nondisease cases and the distribution of the outcome variable is skewed. This violates the assumption of equal mean and variance. The incidence of CKD reported to date ranges from 0.49%/year to 1.9%/year in different disease groups[13,18-22]; that is, approximately 1 in 100 individuals followed up for a year develops CKD, and most participants remain disease free. This skewed distribution, with unequal mean and variance, limits the use of Poisson regression for the predictive modeling of CKD. Various resources suggest the use of zero-inflated Poisson regression in case of overdispersion, as observed for CKD[23]. However, zero-inflated models assume that two distinct processes generate the excess zeros, which remains unexplored for CKD[24,25]. Thus, zero-inflated models may not be applicable to CKD. Negative binomial regression could be a more suitable technique for the predictive modeling of CKD; however, whether it outperforms the proportional odds model in such cases is still debatable[26].
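
The equal mean and variance assumption can be screened with a simple variance-to-mean ratio before a Poisson model is fitted. The sketch below uses invented counts with many zeros, purely for illustration; the function name `dispersion_index` is an assumption, and a formal overdispersion test would be preferable in a real analysis.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean; near 1 under a Poisson model."""
    return variance(counts) / mean(counts)


# Hypothetical event counts for 100 participants (not from any study):
# 90 with no event, 5 with one, 3 with four and 2 with ten events
counts = [0] * 90 + [1] * 5 + [4] * 3 + [10] * 2

ratio = dispersion_index(counts)

# A ratio well above 1 suggests overdispersion relative to the Poisson model
overdispersed = ratio > 1
```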

Logistic regression model for CKD

The logistic regression model was primarily developed by Joseph Berkson, where the relationship between the outcome variable $Y$ and the covariate $X$ is given by: $\operatorname{logit}\{Y = 1 \mid X\} = \operatorname{logit}(P) = \log\left[\frac{P}{1 - P}\right] = X\beta$, where $P = \operatorname{Prob}\{Y = 1 \mid X\}$ and $\beta$ is the regression coefficient.

Logistic regression models a categorical outcome variable using maximum likelihood estimation[8]. The three forms of logistic regression, binary, ordinal (proportional odds model), and multinomial, model three different types of outcome variables: dichotomous, ordinal and nominal, respectively. The sample size for diagnostic models needs to be large enough that the predictive model does not overfit the training data, and it is based on the events per predictor parameter and the number of predictors[27]. CKD is a multifactorial disease with poor awareness of its risk factors, especially in low-resource settings[28-30]. Thus, larger study cohorts with longer periods of follow-up are required to predict CKD, which is a challenge for low-resource settings. Although they have a few limitations, logistic regression models with penalized predictor effects can be used to partially overcome the issue of overfitting[31]. This agrees with the evidence from the existing literature[32]. In small-sample studies, internal validation using bootstrapping could be preferred for robust model estimates[33]. Table 1 shows a hypothetical data layout suitable for logistic regression.
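
To make the logit relationship concrete, the following sketch converts a linear predictor into a predicted probability and computes the events per predictor parameter. The coefficients are invented for illustration and are not estimates from any study; the function names are likewise assumptions.

```python
import math


def logit(p):
    """Log-odds of a probability p with 0 < p < 1."""
    return math.log(p / (1 - p))


def inverse_logit(linear_predictor):
    """Probability corresponding to a linear predictor X * beta."""
    return 1 / (1 + math.exp(-linear_predictor))


# Hypothetical coefficients (assumptions for illustration only)
INTERCEPT = -6.0
BETA_AGE = 0.05        # change in log-odds per year of age
BETA_DIABETES = 0.9    # change in log-odds for diabetes (1) vs none (0)


def predicted_risk(age, diabetes):
    """Predicted probability of the binary outcome (CKD = 1)."""
    return inverse_logit(INTERCEPT + BETA_AGE * age + BETA_DIABETES * diabetes)


def events_per_parameter(n_events, n_parameters):
    """Number of outcome events available per candidate predictor parameter."""
    return n_events / n_parameters


risk = predicted_risk(age=60, diabetes=1)
odds_ratio_diabetes = math.exp(BETA_DIABETES)
```

Exponentiating a coefficient gives the odds ratio for a one-unit change in that covariate, which is how such models are usually reported.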

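Table 1 Hypothetical data format for the use of logistic regression model for chronic kidney disease.

| ID | Age (year) | Gender | eGFR1 (mL/min/1.73 m²) | eGFR_grade1¹ | eGFR2 (mL/min/1.73 m²) | eGFR_grade2¹ | Chronic kidney disease² |
|---|---|---|---|---|---|---|---|
| 1 | 58 | 0 | 88.08 | 2 | 103.68 | 1 | 0 |
| 2 | 48 | 1 | 107.51 | 1 | 88.90 | 2 | 0 |
| 3 | 37 | 1 | 94.28 | 1 | 88.12 | 2 | 0 |
| 4 | 58 | 0 | 93.17 | 1 | 87.06 | 2 | 0 |
| 5 | 53 | 0 | 58.42 | 3 | 51.35 | 3 | 1 |
| 6 | 37 | 0 | 95.73 | 1 | 108.90 | 1 | 0 |
| 7 | 43 | 1 | 84.51 | 2 | 97.25 | 1 | 0 |
| 8 | 49 | 0 | 100.02 | 1 | 97.84 | 1 | 0 |
| 9 | 33 | 1 | 105.80 | 1 | 98.74 | 1 | 0 |
| 10 | 53 | 0 | 108.04 | 1 | 104.39 | 1 | 0 |
| 11 | 46 | 1 | 106.05 | 1 | 89.04 | 2 | 0 |
| 12 | 59 | 0 | 114.62 | 1 | 106.81 | 1 | 0 |
| 13 | 60 | 0 | 121.17 | 1 | 88.75 | 2 | 0 |
| 14 | 40 | 0 | 101.23 | 1 | 103.60 | 1 | 0 |
| 15 | 55 | 1 | 114.35 | 1 | 90.59 | 1 | 0 |
| 16 | 55 | 1 | 90.07 | 1 | 119.00 | 1 | 0 |
| 17 | 42 | 1 | 86.74 | 2 | 157.50 | 1 | 0 |
| 28 | 43 | 0 | 47.93 | 3 | 55.74 | 3 | 1 |
| 29 | 41 | 0 | 97.79 | 1 | 102.74 | 1 | 0 |
| 30 | 30 | 1 | 117.68 | 1 | 77.65 | 1 | 0 |
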
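
The grade and outcome columns of Table 1 can be derived from the two eGFR measurements. The sketch below is one possible derivation, assuming that the two assessments are at least 3 mo apart, that the G3a and G3b categories are pooled as grade 3, and that the binary outcome is defined by eGFR alone; the table footnotes are not reproduced here, so these are assumptions, and the function names are illustrative.

```python
def egfr_grade(egfr):
    """Map eGFR (mL/min/1.73 m²) to a KDIGO G category number.

    G3a and G3b are pooled into a single grade 3 (an assumption).
    """
    if egfr >= 90:
        return 1
    if egfr >= 60:
        return 2
    if egfr >= 30:
        return 3
    if egfr >= 15:
        return 4
    return 5


def ckd_status(egfr1, egfr2):
    """Return 1 if eGFR is below 60 at both assessments, otherwise 0."""
    return int(egfr1 < 60 and egfr2 < 60)


# (ID, eGFR1, eGFR2) for three rows of Table 1
rows = [
    (1, 88.08, 103.68),
    (5, 58.42, 51.35),
    (28, 47.93, 55.74),
]

# ID -> (grade at first visit, grade at second visit, binary CKD outcome)
derived = {
    pid: (egfr_grade(e1), egfr_grade(e2), ckd_status(e1, e2))
    for pid, e1, e2 in rows
}
```

The binary `ckd_status` column is what a binary logistic regression would take as its outcome variable.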
CHALLENGES ASSOCIATED WITH PREDICTIVE MODELING
Overfitting in predictive models

As stated in the previous section, a regression model developed using a small sample is usually overoptimistic and may not perform well in external validation (performance of the developed model in an independent cohort)[8]. Methods for determining an adequate sample size have been suggested to reduce overfitting[10,27,34]. Overfitting also arises for rare diseases with low incidence, where the effects of potential risk factors cannot be accurately estimated. The duration of diabetes plays a major role in the prediction of CKD[35]. However, with poor awareness of diabetes, its duration is often misreported, and adding such a potential predictor with suboptimal accuracy may cause overfitting of the model. Furthermore, chronic diseases like CKD are affected in complex ways by various demographic, biochemical, environmental, genetic and lifestyle-associated factors. Thus, predictive models of chronic diseases with multiple confounding factors are prone to overfitting. To overcome this, several penalization methods have been developed that shrink the coefficients of unimportant variables toward zero and thereby reduce overfitting. LASSO regression, elastic net, and ridge regression are the available penalization methods that account for overfitting[8,36]. A global shrinkage factor of 0.9 is considered optimum, with bootstrapping considered the best method to calculate shrinkage post-estimation[27]. However, shrinkage methods have also been shown to fail in cases of small sample sizes[31]. Thus, using fewer predictors, meaningful derived variables that combine several variables (like body mass index, calculated from height and weight), and principal component analysis to reduce the number of covariates have been suggested[27]. Like the prediction of CKD, the modeling of time to the occurrence of CKD also suffers from overfitting. Thus, apart from dimension reduction techniques, penalization methods such as penalized maximum likelihood for binary logistic regression and penalized likelihood in Cox regression have been described as better-developed and more general shrinkage methods[6,37].
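
How penalization shrinks a coefficient can be seen in the simplest case of one centered predictor and a centered continuous outcome, where ridge and LASSO have closed-form solutions. The sketch below uses invented data and illustrative names; it is a toy illustration of shrinkage, not the penalized likelihood procedure used for logistic or Cox models.

```python
def soft_threshold(value, penalty):
    """Move value toward zero by penalty, returning 0 if it crosses zero."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def one_predictor_fits(x, y, penalty):
    """Return (OLS, ridge, LASSO) slopes for centered x and y.

    Ridge minimizes 0.5 * sum((y - b*x)**2) + 0.5 * penalty * b**2.
    LASSO minimizes 0.5 * sum((y - b*x)**2) + penalty * abs(b).
    """
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    ols = sxy / sxx
    ridge = sxy / (sxx + penalty)
    lasso = soft_threshold(sxy, penalty) / sxx
    return ols, ridge, lasso


# Hypothetical centered data (both lists have mean zero)
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-1.9, -1.2, 0.1, 0.8, 2.2]

ols, ridge, lasso = one_predictor_fits(x, y, penalty=2.0)

# A large enough LASSO penalty removes the predictor entirely
_, _, lasso_heavy = one_predictor_fits(x, y, penalty=50.0)

# Uniform (global) shrinkage: multiply the unpenalized slope by a factor
globally_shrunk = 0.9 * ols
```

Ridge pulls the slope toward zero without ever reaching it, whereas LASSO can set it exactly to zero, which is why LASSO also acts as a variable selection method.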

Loss of information due to categorization of continuous variables

KDIGO classifies CKD using eGFR or ACR, defining it as persistent eGFR < 60 mL/min/1.73 m² or ACR ≥ 30 mg/g for ≥ 3 mo[3]. A variable measured on a continuous scale is categorized to diagnose the disease or to classify it into different stages. Categorization is useful for descriptive purposes but may result in a loss of information for data analysis[8,38]. Studies can be compared efficiently when an optimal cutoff point is available (as in the case of CKD); however, differences in the disease definitions used restrict the generalization of the findings[39-41]. There is a lack of agreement between definitions of decline in renal function or CKD incidence[42,43]. In most studies, CKD was defined as per the KDIGO guidelines (based on outcome), while in some, varying percentage declines in eGFR or increases in ACR were used to describe kidney disease. Declines of ≥ 15%, ≥ 30% and 40%, as well as varying units of annual decline in eGFR, have been used to define a decrease in kidney function[39,44-47]. Similarly, studies disagree on whether a decline in eGFR or an increase in ACR must be persistent, as many analyzed results from a single-time assessment of ACR or eGFR. These differences in disease definitions and categorizations make comparisons between studies difficult, leading to information loss and biased conclusions.
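
The effect of competing definitions can be shown on a single hypothetical participant. The sketch below applies the percentage-decline thresholds mentioned above and the eGFR-based KDIGO criterion to row ID 30 of Table 1; the function and key names are illustrative assumptions, and the two visits are assumed to be at least 3 mo apart.

```python
def percent_decline(baseline, follow_up):
    """Percentage fall in eGFR from baseline to follow-up."""
    return 100 * (baseline - follow_up) / baseline


def classify(baseline, follow_up):
    """Flag one participant under several definitions of kidney function decline."""
    decline = percent_decline(baseline, follow_up)
    return {
        "decline_ge_15pct": decline >= 15,
        "decline_ge_30pct": decline >= 30,
        "decline_ge_40pct": decline >= 40,
        "egfr_below_60_at_both_visits": baseline < 60 and follow_up < 60,
    }


# Table 1, ID 30: eGFR 117.68 at baseline and 77.65 at follow-up
result = classify(117.68, 77.65)
```

The same participant counts as a case under the 15% and 30% definitions but not under the 40% definition or the eGFR < 60 criterion, which is the comparability problem described above.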

For improved CKD prediction, KDIGO guidelines need to be followed when categorizing the parameters used to diagnose CKD. Furthermore, it has been suggested that defining decline in renal function (in the case of CKD) by dichotomizing at the median leads to a loss of power similar to that incurred by losing a third of the data from small studies[38,48]. Dichotomization at the median also leads to false-positive results, underestimation of the extent of variability in the variable, and misclassification of individuals with similar characteristics as being different[49]. Instead, studies suggest using three or more categories (preferably at percentiles), so that the apparent shape of the relationship between the variables under study can be inferred[38]. Nevertheless, the use of quartiles for the categorization of continuous variables remains debatable[50].
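
The information lost by a median split can be demonstrated with a small simulation. The sketch below generates a hypothetical continuous predictor and a correlated outcome, then compares the correlation before and after dichotomizing the predictor at its median. The data, the seed and the function name `pearson` are assumptions made for illustration.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(2024)  # fixed seed so the simulation is repeatable

# Simulated standard normal predictor and an outcome that depends on it
predictor = [rng.gauss(0, 1) for _ in range(2000)]
outcome = [p + rng.gauss(0, 1) for p in predictor]

# Median split: 1 if at or above the (upper) median, otherwise 0
median = sorted(predictor)[len(predictor) // 2]
dichotomized = [1 if p >= median else 0 for p in predictor]

r_continuous = pearson(predictor, outcome)
r_dichotomized = pearson(dichotomized, outcome)
```

The correlation with the dichotomized predictor is weaker than with the continuous one, because all variation within each half of the distribution is discarded.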

CONCLUSION

The prediction of disease is complex and requires several factors and rigorous methodology for the predictive model to be efficient, parsimonious and generalizable to a larger population of interest. To the best of our knowledge, this is the first review discussing the broad categories of predictive modeling methods and the challenges associated with the prediction of CKD. The review discusses the methodological challenges associated with several statistical models used to predict CKD. Following KDIGO guidelines, eGFR or ACR needs to be categorized to define CKD in clinical and epidemiological settings. Thus, regression models that can best handle categorical outcome variables are the suggested methodology for CKD modeling. Because the categorization of eGFR or ACR cannot be neglected in the diagnosis of CKD, linear regression and quantile regression cannot be used for the predictive modeling of CKD. Moreover, with the low yearly incidence of CKD, the equidispersion assumption of Poisson regression cannot be met. The clinical utility of these models can be achieved only if we adhere to the clinical practice guidelines formulated by KDIGO. Thus, in light of the existing literature, binary logistic regression, estimated by maximum likelihood, seems to be the preferred method for the predictive modeling of CKD. Using appropriate shrinkage methods, penalized maximum likelihood estimates can account for overfitting. Finally, given the severity of CKD, its consequences for public health and the challenges associated with its prediction, we anticipate that a predictive model of CKD could be accurate and specific if it includes the important demographic, biochemical and molecular markers rather than being parsimonious, helping to control the kidney disease burden.

ACKNOWLEDGEMENTS

The authors would like to thank Dr. Ashish Awasthi, Senior Scientist, Central Drug Research Institute, India, for the motivation to write the review.

Footnotes

Provenance and peer review: Invited article; Externally peer reviewed.

Peer-review model: Single blind

Specialty type: Methodology

Country of origin: India

Peer-review report’s classification

Scientific Quality: Grade C

Novelty: Grade B

Creativity or Innovation: Grade B

Scientific Significance: Grade B

P-Reviewer: Jamaluddin J S-Editor: Liu JH L-Editor: Kerr C P-Editor: Yu HG

References
1.  Takura T, Hirano Goto K, Honda A. Development of a predictive model for integrated medical and long-term care resource consumption based on health behaviour: application of healthcare big data of patients with circulatory diseases. BMC Med. 2021;19:15.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 8]  [Cited by in F6Publishing: 6]  [Article Influence: 2.0]  [Reference Citation Analysis (0)]
2.  Zhang Q, Zhu Y, Yu W, Xu Z, Zhao Z, Liu S, Xin Y, Lv K. Diagnostic accuracy assessment of molecular prediction model for the risk of NAFLD based on MRI-PDFF diagnosed Chinese Han population. BMC Gastroenterol. 2021;21:88.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 2]  [Cited by in F6Publishing: 7]  [Article Influence: 2.3]  [Reference Citation Analysis (0)]
3.  Kidney Disease: Improving Global Outcomes (KDIGO) Diabetes Work Group. KDIGO 2020 Clinical Practice Guideline for Diabetes Management in Chronic Kidney Disease. Kidney Int. 2020;98:S1-S115.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 525]  [Cited by in F6Publishing: 581]  [Article Influence: 145.3]  [Reference Citation Analysis (0)]
4.  Zager RA. Exogenous creatinine clearance accurately assesses filtration failure in rat experimental nephropathies. Am J Kidney Dis. 1987;10:427-430.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 13]  [Cited by in F6Publishing: 13]  [Article Influence: 0.4]  [Reference Citation Analysis (0)]
5.  Thompson LE, Joy MS. Endogenous markers of kidney function and renal drug clearance processes of filtration, secretion, and reabsorption. Curr Opin Toxicol. 2022;31.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 1]  [Cited by in F6Publishing: 3]  [Article Influence: 1.5]  [Reference Citation Analysis (0)]
6.  Collins GS, Dhiman P, Ma J, Schlussel MM, Archer L, Van Calster B, Harrell FE Jr, Martin GP, Moons KGM, van Smeden M, Sperrin M, Bullock GS, Riley RD. Evaluation of clinical prediction models (part 1): from development to external validation. BMJ. 2024;384:e074819.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 3]  [Cited by in F6Publishing: 3]  [Article Influence: 3.0]  [Reference Citation Analysis (0)]
7.  Riley RD, Archer L, Snell KIE, Ensor J, Dhiman P, Martin GP, Bonnett LJ, Collins GS. Evaluation of clinical prediction models (part 2): how to undertake an external validation study. BMJ. 2024;384:e074820.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 3]  [Cited by in F6Publishing: 3]  [Article Influence: 3.0]  [Reference Citation Analysis (0)]
8.  Harrell FE  Regression Modeling Strategies. Springer Series in Statistics, 2001.  [PubMed]  [DOI]  [Cited in This Article: ]
9.  Inker LA, Collier W, Greene T, Miao S, Chaudhari J, Appel GB, Badve SV, Caravaca-Fontán F, Del Vecchio L, Floege J, Goicoechea M, Haaland B, Herrington WG, Imai E, Jafar TH, Lewis JB, Li PKT, Maes BD, Neuen BL, Perrone RD, Remuzzi G, Schena FP, Wanner C, Wetzels JFM, Woodward M, Heerspink HJL; CKD-EPI Clinical Trials Consortium. A meta-analysis of GFR slope as a surrogate endpoint for kidney failure. Nat Med. 2023;29:1867-1876.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 5]  [Cited by in F6Publishing: 13]  [Article Influence: 13.0]  [Reference Citation Analysis (0)]
10.  Riley RD, Snell KIE, Ensor J, Burke DL, Harrell FE Jr, Moons KGM, Collins GS. Minimum sample size for developing a multivariable prediction model: Part I-Continuous outcomes. Stat Med. 2019;38:1262-1275.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 126]  [Cited by in F6Publishing: 123]  [Article Influence: 24.6]  [Reference Citation Analysis (0)]
11.  Rosansky SJ, Glassock RJ. Is a decline in estimated GFR an appropriate surrogate end point for renoprotection drug trials? Kidney Int. 2014;85:723-727.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 32]  [Cited by in F6Publishing: 33]  [Article Influence: 3.3]  [Reference Citation Analysis (0)]
12.  Zhang W, Jia L, Liu DLX, Chen L, Wang Q, Song K, Nie S, Ma J, Chen X, Xiu M, Gao M, Zhao D, Zheng Y, Duan S, Dong Z, Li Z, Wang P, Fu B, Cai G, Sun X, Chen X. Serum Stem Cell Factor Level Predicts Decline in Kidney Function in Healthy Aging Adults. J Nutr Health Aging. 2019;23:813-820.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 9]  [Cited by in F6Publishing: 5]  [Article Influence: 1.0]  [Reference Citation Analysis (0)]
13.  Cheung KL, Crews DC, Cushman M, Yuan Y, Wilkinson K, Long DL, Judd SE, Shlipak MG, Ix JH, Bullen AL, Warnock DG, Gutiérrez OM. Risk Factors for Incident CKD in Black and White Americans: The REGARDS Study. Am J Kidney Dis. 2023;82:11-21.e1.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 4]  [Reference Citation Analysis (0)]
14.  Huang LY, Chen FY, Jhou MJ, Kuo CH, Wu CZ, Lu CH, Chen YL, Pei D, Cheng YF, Lu CJ. Comparing Multiple Linear Regression and Machine Learning in Predicting Diabetic Urine Albumin-Creatinine Ratio in a 4-Year Follow-Up Study. J Clin Med. 2022;11.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 7]  [Cited by in F6Publishing: 6]  [Article Influence: 3.0]  [Reference Citation Analysis (0)]
15.  Lê Cook B, Manning WG. Thinking beyond the mean: a practical guide for using quantile regression methods for health services research. Shanghai Arch Psychiatry. 2013;25:55-59.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in F6Publishing: 41]  [Reference Citation Analysis (0)]
16.  Marrie RA, Dawson NV, Garland A. Quantile regression and restricted cubic splines are useful for exploring relationships between continuous variables. J Clin Epidemiol. 2009;62:511-7.e1.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 170]  [Cited by in F6Publishing: 166]  [Article Influence: 11.1]  [Reference Citation Analysis (0)]
17.  Coxe S, West SG, Aiken LS. The analysis of count data: a gentle introduction to poisson regression and its alternatives. J Pers Assess. 2009;91:121-136.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 555]  [Cited by in F6Publishing: 449]  [Article Influence: 29.9]  [Reference Citation Analysis (0)]
18.  Kampmann JD, Heaf JG, Mogensen CB, Mickley H, Wolff DL, Brandt F. Prevalence and incidence of chronic kidney disease stage 3-5 - results from KidDiCo. BMC Nephrol. 2023;24:17.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 15]  [Cited by in F6Publishing: 11]  [Article Influence: 11.0]  [Reference Citation Analysis (0)]
19.  Barkas F, Elisaf M, Liberopoulos E, Kalaitzidis R, Liamis G. Uric acid and incident chronic kidney disease in dyslipidemic individuals. Curr Med Res Opin. 2018;34:1193-1199.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 17]  [Cited by in F6Publishing: 14]  [Article Influence: 2.3]  [Reference Citation Analysis (0)]
20.  Pongpirul W, Pongpirul K, Ananworanich J, Klinbuayaem V, Avihingsanon A, Prasithsirikul W. Chronic kidney disease incidence and survival of Thai HIV-infected patients. AIDS. 2018;32:393-398.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 8]  [Cited by in F6Publishing: 8]  [Article Influence: 1.3]  [Reference Citation Analysis (0)]
21.  Agarwal R, Song RJ, Vasan RS, Xanthakis V. Left Ventricular Mass and Incident Chronic Kidney Disease. Hypertension. 2020;75:702-706.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 5]  [Cited by in F6Publishing: 11]  [Article Influence: 2.8]  [Reference Citation Analysis (0)]
22.  Yu MK, Katon W, Young BA. Associations between sex and incident chronic kidney disease in a prospective diabetic cohort. Nephrology (Carlton). 2015;20:451-458.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 32]  [Cited by in F6Publishing: 37]  [Article Influence: 4.6]  [Reference Citation Analysis (0)]
23.  Yang Z, Hardin JW, Addy CL. Testing overdispersion in the zero-inflated Poisson model. J Stat Plan Infer. 2009;139:3340-3353.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 31]  [Cited by in F6Publishing: 26]  [Article Influence: 1.7]  [Reference Citation Analysis (0)]
24.  Paulo Fávero L  Count Data Regression Analysis: Concepts, Overdispersion Detection, Zero-inflation Identification, and Applications with R Detection, Zero-inflation Identification, and Applications with R. 2021. Available from: https://scholarworks.umass.edu/cgi/viewcontent.cgi?article=1488&context=pare.  [PubMed]  [DOI]  [Cited in This Article: ]
25.  Moriña D, Puig P, Navarro A. Analysis of zero inflated dichotomous variables from a Bayesian perspective: application to occupational health. BMC Med Res Methodol. 2021;21:277.
26.  Fernandez GA, Vatcheva KP. A comparison of statistical methods for modeling count data with an application to hospital length of stay. BMC Med Res Methodol. 2022;22:211.
27.  Riley RD, Snell KI, Ensor J, Burke DL, Harrell FE Jr, Moons KG, Collins GS. Minimum sample size for developing a multivariable prediction model: PART II - binary and time-to-event outcomes. Stat Med. 2019;38:1276-1296.
28.  Chen TK, Knicely DH, Grams ME. Chronic Kidney Disease Diagnosis and Management: A Review. JAMA. 2019;322:1294-1304.
29.  Khandpur S, Bhardwaj M, Awasthi A, Newtonraj A, Purty AJ, Khanna T, Abraham G, Tiwari S. Association of kidney functions with a cascade of care for diabetes and hypertension in two geographically distinct Indian cohorts. Diabetes Res Clin Pract. 2021;176:108861.
30.  Flood D, Seiglie JA, Dunn M, Tschida S, Theilmann M, Marcus ME, Brian G, Norov B, Mayige MT, Singh Gurung M, Aryal KK, Labadarios D, Dorobantu M, Silver BK, Bovet P, Adelin Jorgensen JM, Guwatudde D, Houehanou C, Andall-Brereton G, Quesnel-Crooks S, Sturua L, Farzadfar F, Saeedi Moghaddam S, Atun R, Vollmer S, Bärnighausen TW, Davies JI, Wexler DJ, Geldsetzer P, Rohloff P, Ramírez-Zea M, Heisler M, Manne-Goehler J. The state of diabetes treatment coverage in 55 low-income and middle-income countries: a cross-sectional study of nationally representative, individual-level data in 680 102 adults. Lancet Healthy Longev. 2021;2:e340-e351.
31.  Riley RD, Snell KIE, Martin GP, Whittle R, Archer L, Sperrin M, Collins GS. Penalization and shrinkage methods produced unreliable clinical prediction models especially when sample size was small. J Clin Epidemiol. 2021;132:88-96.
32.  Nusinovici S, Tham YC, Chak Yan MY, Wei Ting DS, Li J, Sabanayagam C, Wong TY, Cheng CY. Logistic regression was as good as machine learning for predicting major chronic diseases. J Clin Epidemiol. 2020;122:56-69.
33.  Steyerberg EW, Bleeker SE, Moll HA, Grobbee DE, Moons KG. Internal and external validation of predictive models: a simulation study of bias and precision in small samples. J Clin Epidemiol. 2003;56:441-447.
34.  Riley RD, Ensor J, Snell KIE, Harrell FE Jr, Martin GP, Reitsma JB, Moons KGM, Collins G, van Smeden M. Calculating the sample size required for developing a clinical prediction model. BMJ. 2020;368:m441.
35.  Fenta ET, Eshetu HB, Kebede N, Bogale EK, Zewdie A, Kassie TD, Anagaw TF, Mazengia EM, Gelaw SS. Prevalence and predictors of chronic kidney disease among type 2 diabetic patients worldwide, systematic review and meta-analysis. Diabetol Metab Syndr. 2023;15:245.
36.  Cessie SL, Houwelingen JCV. Ridge Estimators in Logistic Regression. Appl Stat. 1992;41:191.
37.  Verweij PJ, Van Houwelingen HC. Penalized likelihood in Cox regression. Stat Med. 1994;13:2427-2436.
38.  Altman DG. Categorizing Continuous Variables. Wiley StatsRef: Statistics Reference Online, 2014.
39.  Masrouri S, Alijanzadeh D, Amiri M, Azizi F, Hadaegh F. Predictors of decline in kidney function in the general population: a decade of follow-up from the Tehran Lipid and Glucose Study. Ann Med. 2023;55:2216020.
40.  Baba M, Shimbo T, Horio M, Ando M, Yasuda Y, Komatsu Y, Masuda K, Matsuo S, Maruyama S. Longitudinal Study of the Decline in Renal Function in Healthy Subjects. PLoS One. 2015;10:e0129036.
41.  Lin CC, Niu MJ, Li CI, Liu CS, Lin CH, Yang SY, Li TC. Development and validation of a risk prediction model for chronic kidney disease among individuals with type 2 diabetes. Sci Rep. 2022;12:4794.
42.  Levey AS, Eckardt KU, Tsukamoto Y, Levin A, Coresh J, Rossert J, De Zeeuw D, Hostetter TH, Lameire N, Eknoyan G. Definition and classification of chronic kidney disease: a position statement from Kidney Disease: Improving Global Outcomes (KDIGO). Kidney Int. 2005;67:2089-2100.
43.  Levey AS, Eckardt KU, Dorman NM, Christiansen SL, Hoorn EJ, Ingelfinger JR, Inker LA, Levin A, Mehrotra R, Palevsky PM, Perazella MA, Tong A, Allison SJ, Bockenhauer D, Briggs JP, Bromberg JS, Davenport A, Feldman HI, Fouque D, Gansevoort RT, Gill JS, Greene EL, Hemmelgarn BR, Kretzler M, Lambie M, Lane PH, Laycock J, Leventhal SE, Mittelman M, Morrissey P, Ostermann M, Rees L, Ronco P, Schaefer F, St Clair Russell J, Vinck C, Walsh SB, Weiner DE, Cheung M, Jadoul M, Winkelmayer WC. Nomenclature for kidney function and disease: report of a Kidney Disease: Improving Global Outcomes (KDIGO) Consensus Conference. Kidney Int. 2020;97:1117-1129.
44.  Hayashi K, Takayama M, Abe T, Kanda T, Hirose H, Shimizu-Hirota R, Shiomi E, Iwao Y, Itoh H. Investigation of Metabolic Factors Associated with eGFR Decline Over 1 Year in a Japanese Population without CKD. J Atheroscler Thromb. 2017;24:863-875.
45.  Ataga KI, Zhou Q, Derebail VK, Saraf SL, Hankins JS, Loehr LR, Garrett ME, Ashley-Koch AE, Cai J, Telen MJ. Rapid decline in estimated glomerular filtration rate in sickle cell anemia: results of a multicenter pooled analysis. Haematologica. 2021;106:1749-1753.
46.  Zhang Z, He P, Liu M, Zhou C, Liu C, Li H, Zhang Y, Li Q, Ye Z, Wu Q, Wang G, Liang M, Qin X. Association of Depressive Symptoms with Rapid Kidney Function Decline in Adults with Normal Kidney Function. Clin J Am Soc Nephrol. 2021;16:889-897.
47.  Grams ME, Brunskill NJ, Ballew SH, Sang Y, Coresh J, Matsushita K, Surapaneni A, Bell S, Carrero JJ, Chodick G, Evans M, Heerspink HJL, Inker LA, Iseki K, Kalra PA, Kirchner HL, Lee BJ, Levin A, Major RW, Medcalf J, Nadkarni GN, Naimark DMJ, Ricardo AC, Sawhney S, Sood MM, Staplin N, Stempniewicz N, Stengel B, Sumida K, Traynor JP, van den Brand J, Wen CP, Woodward M, Yang JW, Wang AY, Tangri N; CKD Prognosis Consortium. Development and Validation of Prediction Models of Adverse Kidney Outcomes in the Population With and Without Diabetes. Diabetes Care. 2022;45:2055-2063.
48.  Cohen J. The Cost of Dichotomization. Appl Psychol Meas. 1983;7:249-253.
49.  Altman DG, Royston P. The cost of dichotomising continuous variables. BMJ. 2006;332:1080.
50.  Bennette C, Vickers A. Against quantiles: categorization of continuous variables in epidemiologic research, and its discontents. BMC Med Res Methodol. 2012;12:21.