Observational Study Open Access
Copyright ©The Author(s) 2025. Published by Baishideng Publishing Group Inc. All rights reserved.
World J Psychiatry. Aug 19, 2025; 15(8): 106622
Published online Aug 19, 2025. doi: 10.5498/wjp.v15.i8.106622
Machine learning-based nomogram for predicting depressive symptoms in women: A cross-sectional study in Guangdong Province, China
Jia-Min Chen, Yu-Ting Wei, Qiong-Gui Zhou, Jun-Long Tao, Bo Bi, School of Public Health, Hainan Medical University, Hainan Academy of Medical Science, Haikou 571199, Hainan Province, China
Mei Rao, Department of Pharmacy, Longyan First Hospital Affiliated to Fujian Medical University, Longyan 364000, Fujian Province, China
Shi-Bin Wang, Guangdong Mental Health Center, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou 510000, Guangdong Province, China
ORCID number: Shi-Bin Wang (0000-0002-0354-0008); Bo Bi (0000-0001-5563-4025).
Co-first authors: Jia-Min Chen and Mei Rao.
Co-corresponding authors: Shi-Bin Wang and Bo Bi.
Author contributions: Chen JM and Rao M collected the data, wrote the first version of the manuscript, and made equal contributions as co-first authors; Wei YT, Zhou QG, and Tao JL contributed to data processing and analysis; Wang SB and Bi B designed the study, supervised the study, revised the paper, and made equal contributions as co-corresponding authors. All authors agreed to publish the manuscript.
Supported by Longyan City Science and Technology Plan Project, No. 2024 LYF17067.
Institutional review board statement: The study protocol was approved by the Research Ethics Committee of the Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, No. GDREC2018543H(R1).
Informed consent statement: All participants provided written informed consent prior to the survey.
Conflict-of-interest statement: All the authors report no relevant conflicts of interest for this article.
STROBE statement: The authors have read the STROBE Statement-checklist of items, and the manuscript was prepared and revised according to the STROBE Statement-checklist of items.
Data sharing statement: Data referenced in this study are available in the Guangdong Provincial Sleep and Psychosomatic Health Survey database. The datasets analyzed during the current study are available from the corresponding author on reasonable request. The complete data are not publicly available due to them containing information that could compromise research participant privacy or consent. The code related to this study has been uploaded to GitHub: https://github.com/KMccn/Machine-Learning-Based-Nomogram-for-Predicting-Depressive-Symptoms.
Open Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/Licenses/by-nc/4.0/
Corresponding author: Bo Bi, School of Public Health, Hainan Medical University, Xueyuan Road, Longhua District, Haikou 571199, Hainan Province, China. bibo@hainmc.edu.cn
Received: March 4, 2025
Revised: April 17, 2025
Accepted: June 25, 2025
Published online: August 19, 2025
Processing time: 158 Days and 5.2 Hours

Abstract
BACKGROUND

Female depression is a prevalent and increasingly recognized mental health issue. Due to cultural and social factors, many female patients still face challenges in diagnosis and treatment, and traditional assessment methods often fail to identify high-risk individuals accurately. This highlights the necessity of developing more precise predictive tools. Utilizing machine learning (ML) algorithms to construct predictive models may overcome the limitations of traditional methods, providing more comprehensive support for women’s mental health.

AIM

To construct an ML-nomogram hybrid model that translates multivariate risk predictors of female depressive symptoms into actionable clinical scoring thresholds, optimizing predictive accuracy and interpretability for healthcare applications.

METHODS

We analyzed data from 7609 female participants aged 18 to 85 years from the Guangdong Provincial Sleep and Psychosomatic Health Survey. Sixteen variables, including anxiety symptoms, insomnia, chronic diseases, exercise habits, and age, were selected based on prior literature and comprehensively incorporated into ML models to maximize predictive information utilization. Three ML algorithms, extreme gradient boosting, support vector machine, and light gradient boosting machine, were employed to construct predictive models. Model performance was evaluated using accuracy, precision, recall, F1 score, and area under the curve (AUC). Feature importance was interpreted using SHapley Additive exPlanations (SHAP), with ablation studies validating the impact of the top five SHAP-derived features on predictive performance, and a nomogram was constructed based on these prioritized predictors. Clinical utility was assessed through decision curve analysis.

RESULTS

The prevalence of depressive symptoms was 6.8% among the sample. The evaluation of predictive models revealed that light gradient boosting machine achieved a top-performing AUC of 0.867, placing it ahead of extreme gradient boosting (AUC = 0.862) and support vector machine (AUC = 0.849). SHAP analysis identified insomnia, anxiety symptoms, age, chronic disease, and exercise as the top five predictors. The nomogram based on these features demonstrated excellent discrimination (AUC = 0.910) and calibration, with significant net benefits in decision curve analysis compared to baseline strategies. The model effectively stratifies depressive symptoms risk, facilitating personalized and quantitative assessments in clinical settings. We also developed an interactive digital version of the nomogram to facilitate its application in clinical practice.

CONCLUSION

The ML-based model effectively predicts depressive symptoms in women, identifying insomnia, anxiety symptoms, age, chronic diseases, and exercise as key predictors, offering a practical tool for early detection and intervention.

Key Words: Depressive symptoms; Women’s mental health; Machine learning; Predictive modeling; SHapley Additive exPlanations; Nomogram; Guangdong Province

Core Tip: This study leverages machine learning to develop a highly accurate predictive model for depressive symptoms in women. Ablation studies systematically validated the critical contributions of these top-ranked SHapley Additive exPlanations features, demonstrating significant performance degradation upon their removal. The light gradient boosting machine model achieved superior performance, supported by SHapley Additive exPlanations for interpretability and a nomogram for clinical application. This innovative approach offers a practical tool for early detection and personalized intervention, addressing the limitations of traditional methods. Findings highlight the potential of machine learning to enhance women’s mental health outcomes, with implications for improving diagnostic precision and treatment strategies in diverse clinical settings.



INTRODUCTION

Depressive disorder, also known as depression, is a leading cause of global disease burden[1] and disproportionately affects women, with a prevalence of approximately 5% among adults and 6% among women. Female gender is considered a significant risk factor for depression[2-4]; the prevalence of major depressive disorder in women is twice that in men after reaching adulthood[5]. Depression, characterized by persistent depressed mood or loss of pleasure in activities, is associated with elevated risks of premature mortality[6,7], chronic diseases, reduced work capacity, and impaired social functioning[8,9]. Although reproductive health and maternal care have been prioritized to date, women’s healthcare needs extend far beyond these areas[8].

In China, rapid economic development and urbanization over the past three decades have contributed to an increase in depressive symptoms and mood disorders[10,11]. Guangdong province, with its dense population and high economic status, reported a GDP exceeding 14 trillion yuan in 2024. While economic growth has improved living standards, it has also intensified social pressures and family role conflicts, potentially heightening the risk of depression among women[12]. Female patients with depression face challenges in diagnosis and treatment due to cultural and social factors.

Advanced machine learning (ML) algorithms have significantly improved depression prediction by processing more variables and capturing complex nonlinear relationships, enhancing prediction accuracy and early diagnosis. However, ML models face dual implementation barriers in primary care: Limited interpretability of algorithmic outputs and low clinician trust due to unverified decision logic[13]. To address these challenges, recent studies have explored ML-nomogram hybrid approaches, which bridge the algorithmic precision of multivariate risk prediction with clinical interpretability. These hybrids transform complex variable interactions into actionable scoring thresholds, facilitating individualized risk stratification and treatment optimization[13]. Studies have shown the effectiveness of such hybrids in predicting depression in various populations, including cross-cultural models for adolescent self-injury[14] and multimodal algorithms integrating biopsychosocial predictors for perinatal depression screening[15].

However, research on ML-based depression prediction, especially among women in China, remains limited. While previous studies have focused on specific subgroups such as postpartum women[15], the elderly[16], and college students[17], there is a paucity of research on the general female population. Using data from the Guangdong Provincial Sleep and Psychosomatic Health Survey (GSPHS), this study proposed an ML-driven nomogram that translates complex risk factors of depressive symptoms in women into a clinically actionable scoring system, enabling dynamic risk assessment and personalized prevention strategies. This study adopts rigorous data processing techniques, including feature engineering, as endorsed by recent ML studies on depression prediction[18], to support the robustness and interpretability of the predictive model.

MATERIALS AND METHODS
Data source

The GSPHS was a population-based cross-sectional study conducted from September to November 2019 in Guangdong Province, China. It targeted a representative sample of adults aged 18 to 85 years. It was conducted under the guidance and oversight of the Guangdong Mental Health Center. Using a multi-stage sampling method, 17132 adults were randomly selected from 180 communities across 72 districts or counties in 21 prefecture-level cities, with 100 residents chosen from each community. Of these, 13768 individuals responded to the survey, yielding a response rate of 80.4%. The survey methodology for this study is elaborated upon in our previous research[19]. The GSPHS dataset served as the source for the data utilized in the machine learning model. We extracted 7609 female participants from the GSPHS to conduct this study. The study used a complete dataset with no missing values for the variables included. The data processing and model-building processes of ML are displayed in Figure 1.

Figure 1 The flow diagram of data processing and model building process. XGBoost: Extreme gradient boosting; SVM: Support vector machine; LightGBM: Light gradient boosting machine; AUC: Area under the curve.
Measurements

The Patient Health Questionnaire-9 (PHQ-9), a nine-item scale based on the major depression criteria of the Diagnostic and Statistical Manual of Mental Disorders, was used to assess depressive symptoms during the last two weeks[20-22]. The total score ranges from 0 to 27, with higher values indicating more severe depressive symptoms. The Chinese PHQ-9 has been widely validated, with favorable psychometric properties for screening depression in the general population[23,24]. In this study, a total PHQ-9 score ≥ 10 was defined as clinical depressive symptoms[20], consistent with the cutoff used in previous research[25,26].

The Generalized Anxiety Disorder-2 (GAD-2) scale demonstrates robust psychometric validity and reliability for screening generalized anxiety symptoms, though its diagnostic accuracy varies across cultural contexts[27,28]. Participants were asked how often, over the past two weeks, they had been bothered by two core symptoms of GAD: (1) “Feeling nervous, anxious, or on edge”; and (2) “Not being able to stop or control worrying”, with each item scored 0-3 based on frequency[29]. Studies in Chinese populations found that at its optimal cutoff, the GAD-2 scale demonstrated strong diagnostic performance, with high sensitivity and specificity, as well as a notably elevated diagnostic odds ratio[30]. In this study, a total GAD-2 score ≥ 3 was taken to indicate anxiety symptoms. Sleep quality and sleep disturbances over the previous few weeks were evaluated with the Pittsburgh Sleep Quality Index (PSQI), a 19-item self-report questionnaire in which higher scores indicate poorer sleep quality. The PSQI is suitable for clinical psychiatry and sleep quality evaluation studies in China, offering reliability and effectiveness in assessing and screening sleep disorders in the Chinese population[31]. A total PSQI score > 7 was taken to indicate sleep problems[32]. Additionally, data were collected on body mass index (BMI), chronic diseases, and social-demographic and lifestyle factors, including place of residence, marital status, education level, income class, alcohol consumption, tea consumption, smoking status, exercise frequency, sleep latency, sleep duration, and sleep efficiency. Detailed measurement protocols for these variables are provided in Supplementary material.
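The screening cutoffs described above amount to simple threshold rules. The following Python sketch is illustrative only and is not taken from the study code: the function and field names are hypothetical, and the PHQ-9 rule follows the "10 or above" outcome definition.

```python
from dataclasses import dataclass


@dataclass
class ScreeningResult:
    depressive_symptoms: bool  # PHQ-9 total >= 10
    anxiety_symptoms: bool     # GAD-2 total >= 3
    sleep_problems: bool       # PSQI total > 7


def _check_items(items, expected_count, name):
    """Validate that a scale has the expected number of items scored 0-3."""
    if len(items) != expected_count or any(i not in (0, 1, 2, 3) for i in items):
        raise ValueError(f"{name} needs {expected_count} item scores in the range 0-3")


def classify(phq9_items, gad2_items, psqi_total):
    """Apply the cutoffs stated in the text to one participant."""
    _check_items(phq9_items, 9, "PHQ-9")
    _check_items(gad2_items, 2, "GAD-2")
    return ScreeningResult(
        depressive_symptoms=sum(phq9_items) >= 10,
        anxiety_symptoms=sum(gad2_items) >= 3,
        sleep_problems=psqi_total > 7,
    )


# Made-up responses for illustration (not study data)
example = classify([2, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1], 8)
print(example)
```

The PSQI total is passed in precomputed because its component scoring is more involved than a plain item sum.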

Drawing upon predictors of depressive symptoms identified in prior research[33,34] and accounting for the variables contained within our data source, we selected 16 factors for analysis. These included: Age, residence, BMI, education year, marital status, income class, smoke, drink, tea, exercise, chronic diseases, insomnia, anxiety symptoms, sleep latency, sleep duration, and sleep efficiency. All variables are presented in Table 1.

Table 1 Demographics and baseline characteristics, n (%).

| Characteristic | Overall, N = 7609 | No depressive symptoms, n = 7090 | Depressive symptoms, n = 519 | Statistic | P value¹ |
| --- | --- | --- | --- | --- | --- |
| Residence | | | | 0.48 | 0.488 |
| Urban area | 5933 (78.0) | 5522 (77.9) | 411 (79.2) | | |
| Rural area | 1676 (22.0) | 1568 (22.1) | 108 (20.8) | | |
| Age, years | | | | 47.52 | < 0.001 |
| 18-39 | 5552 (73.0) | 5106 (72.0) | 446 (85.9) | | |
| 40-59 | 1901 (25.0) | 1833 (25.9) | 68 (13.1) | | |
| 60-85 | 156 (2.1) | 151 (2.1) | 5 (1.0) | | |
| Education, years | | | | 2.78 | 0.250 |
| ≤ 9 | 815 (10.7) | 769 (10.8) | 46 (8.9) | | |
| 10-15 | 4094 (53.8) | 3818 (53.9) | 276 (53.2) | | |
| ≥ 16 | 2700 (35.5) | 2503 (35.3) | 197 (38.0) | | |
| Marital status | | | | 67.87 | < 0.001 |
| Single/unmarried | 2431 (31.9) | 2186 (30.8) | 245 (47.2) | | |
| Married | 4973 (65.4) | 4720 (66.6) | 253 (48.7) | | |
| Widowed/divorced/separated | 205 (2.7) | 184 (2.6) | 21 (4.0) | | |
| Income class, yuan | | | | 1.23 | 0.541 |
| < 3500 | 2935 (38.6) | 2723 (38.4) | 212 (40.8) | | |
| 3500-6000 | 3120 (41.0) | 2916 (41.1) | 204 (39.3) | | |
| > 6000 | 1554 (20.4) | 1451 (20.5) | 103 (19.8) | | |
| Drink, times/week | | | | 59.46 | < 0.001 |
| < 1 | 7082 (93.1) | 6642 (93.7) | 440 (84.8) | | |
| ≥ 1 | 527 (6.9) | 448 (6.3) | 79 (15.2) | | |
| Tea, times/week | | | | 0.32 | 0.569 |
| < 4 | 4824 (63.4) | 4501 (63.5) | 323 (62.2) | | |
| ≥ 4 | 2785 (36.6) | 2589 (36.5) | 196 (37.8) | | |
| Exercise | | | | 76.01 | < 0.001 |
| Hardly exercise | 2347 (30.8) | 2103 (29.7) | 244 (47.0) | | |
| 1-3 times/month | 2452 (32.2) | 2308 (32.6) | 144 (27.7) | | |
| 1-2 times/week | 1426 (18.7) | 1345 (19.0) | 81 (15.6) | | |
| ≥ 3 times/week | 1384 (18.2) | 1334 (18.8) | 50 (9.6) | | |
| Chronic disease | | | | 244.42 | < 0.001 |
| Yes | 5745 (75.5) | 5501 (77.6) | 244 (47.0) | | |
| No | 1864 (24.5) | 1589 (22.4) | 275 (53.0) | | |
| BMI | | | | 14.00 | 0.003 |
| 18.5-23.9 | 4964 (65.2) | 4636 (65.4) | 328 (63.2) | | |
| < 18.5 | 1329 (17.5) | 1212 (17.1) | 117 (22.5) | | |
| 24.0-27.9 | 1061 (13.9) | 1007 (14.2) | 54 (10.4) | | |
| > 28.0 | 255 (3.4) | 235 (3.3) | 20 (3.9) | | |
| Anxiety symptoms | | | | 2009.91 | < 0.001 |
| No | 7263 (95.5) | 6973 (98.3) | 290 (55.9) | | |
| Yes | 346 (4.5) | 117 (1.7) | 229 (44.1) | | |
| Sleep latency, minutes | | | | 222.56 | < 0.001 |
| ≤ 15 | 2744 (36.1) | 2647 (37.3) | 97 (18.7) | | |
| 16-30 | 3687 (48.5) | 3457 (48.8) | 230 (44.3) | | |
| 31-60 | 733 (9.6) | 625 (8.8) | 108 (20.8) | | |
| > 60 | 445 (5.8) | 361 (5.1) | 84 (16.2) | | |
| Sleep duration, hours | | | | 216.64 | < 0.001 |
| ≥ 7 | 3014 (39.6) | 2895 (40.8) | 119 (22.9) | | |
| 6-6.99 | 3980 (52.3) | 3705 (52.3) | 275 (53.0) | | |
| 5-5.99 | 461 (6.1) | 370 (5.2) | 91 (17.5) | | |
| < 5 | 154 (2.0) | 120 (1.7) | 34 (6.6) | | |
| Sleep efficiency, % | | | | 95.87 | < 0.001 |
| > 85 | 5948 (78.2) | 5606 (79.1) | 342 (65.9) | | |
| 75-84 | 1119 (14.7) | 1030 (14.5) | 89 (17.1) | | |
| 65-74 | 319 (4.2) | 275 (3.9) | 44 (8.5) | | |
| < 65 | 223 (2.9) | 179 (2.5) | 44 (8.5) | | |
| Smoke | | | | 92.97 | < 0.001 |
| No | 7481 (98.3) | 6998 (98.7) | 483 (93.1) | | |
| Yes | 128 (1.7) | 92 (1.3) | 36 (6.9) | | |
| Insomnia | | | | 1135.26 | < 0.001 |
| No | 6179 (81.2) | 6047 (85.3) | 132 (25.4) | | |
| Yes | 1430 (18.8) | 1043 (14.7) | 387 (74.6) | | |

Statistical analysis

Data were analyzed using the Statistical Package for the Social Sciences (version 25.0) and R (version 4.2.0). Categorical variables are expressed as n (%) and were compared between groups using the χ2 test. Results were considered significant when P < 0.05 (two-sided). All ML algorithms were implemented in Python (version 3.10). The “shinydashboard” and “DynNom” packages in R were used to build an interactive online nomogram, which simplifies the clinical use of our prediction model.
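As an illustration of the between-group comparison, the following Python sketch computes a Pearson χ2 statistic from the residence counts in Table 1. It is a minimal sketch and not the authors' SPSS or R procedure; the function name is hypothetical, and the absence of a continuity correction is an assumption, since the text does not state whether one was applied.

```python
def pearson_chi2(table):
    """Pearson chi-square statistic (no continuity correction) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column variables
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Residence by depressive-symptom status, counts taken from Table 1
residence = [
    [5522, 411],  # urban area: no depressive symptoms, depressive symptoms
    [1568, 108],  # rural area: no depressive symptoms, depressive symptoms
]
stat, dof = pearson_chi2(residence)
print(f"chi2 = {stat:.2f}, dof = {dof}")
```

Under these assumptions the uncorrected statistic rounds to 0.48 with one degree of freedom, which agrees with the value listed for residence in Table 1.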

ML algorithms

Three separate risk prediction models were constructed using ML methods commonly applied to classification problems: Extreme gradient boosting (XGBoost), support vector machine (SVM), and light gradient boosting machine (LightGBM), with subsequent performance comparisons conducted. These algorithms were selected due to their proven effectiveness in handling complex, high-dimensional categorical data and their ability to capture non-linear relationships and feature interactions, which are critical for accurately predicting depressive symptoms. XGBoost works by building a series of decision trees, where each tree improves upon the predictions of the previous one[35]. It continuously refines its accuracy through this process, making it fast and highly adaptable for handling the large number of variables and data points in this study. Capable of both classification and regression, SVM functions by defining a distinct separation between different classes and utilizes “kernel” methods to handle complex data structures[36]. To solve large problems efficiently, SVM training breaks them down into smaller parts, allowing for faster analysis and better adaptation to new data[37]. Designed for large datasets, LightGBM is an efficient algorithm that trains models quickly, uses minimal memory, and handles categorical variables effectively[38]. It also supports parallel processing, making it well suited to the large-scale data in this study.

Model performance assessment

The systematic comparison of three ML architectures is necessitated by the need to identify the optimal model that balances predictive performance, computational efficiency, and generalizability in clinical decision-making scenarios. Each model’s performance was evaluated using multiple metrics, including accuracy, precision, recall, F1 score, and area under the curve (AUC). The receiver operating characteristic (ROC) curve is a unit-square plot that depicts the trade-off between a classifier’s true positive rate (TPR) and false positive rate (FPR) across classification thresholds, thereby serving as a key evaluation tool[39]. For a binary classifier, TPR is the probability that the model correctly predicts the positive class for an actual positive case, and FPR is the probability that the model incorrectly predicts the positive class for an actual negative case[39,40]. A model’s classification performance can be quantified by the AUC, which is calculated as the total area beneath the ROC curve. It provides a numerical value for comparing the performance of different models or indicators. A perfectly calibrated model is characterized by a calibration curve that follows the diagonal y = x line, indicating a perfect correspondence between predicted probabilities and actual outcomes[41]. Consequently, any deviation from this diagonal, in terms of the curve’s shape or position, is indicative of the model’s predictive bias and consistency[41]. Decision curve analysis (DCA) evaluates predictive models by calculating net benefit across various threshold probabilities. It plots net benefit against threshold probability, considering clinical consequences[42]. DCA determines the clinical net benefit of a model by benchmarking it against the default strategies of treating all or treating no patients, thereby providing a more clinically relevant assessment than traditional accuracy metrics[43].
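To make the ROC quantities concrete, the following Python sketch computes TPR and FPR at a single threshold and the AUC through its rank interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case). The function names and the toy data are hypothetical and are not taken from the study.

```python
def tpr_fpr(y_true, scores, threshold):
    """TPR and FPR when a case is predicted positive if score >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    return tp / (tp + fn), fp / (fp + tn)


def roc_auc(y_true, scores):
    """AUC as the share of positive-negative pairs ranked correctly; ties count one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores, made up for illustration (not study data)
y = [0, 0, 1, 1]
s = [0.1, 0.4, 0.35, 0.8]
print(tpr_fpr(y, s, 0.5))  # one point on the ROC curve
print(roc_auc(y, s))       # area under the full curve
```

The pairwise form is quadratic in sample size, so it suits small examples; sorting-based implementations are preferable for data of the size used in this study.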

Accuracy quantifies the overall proportion of correct predictions made by a model. For binary or multi-class classification tasks, it is calculated by dividing the sum of true positives (TP) and true negatives (TN) by the total number of cases. The formula is expressed as:

Accuracy = (TP + TN)/(TP + FP + TN + FN) × 100%.

where FP and FN represent false positive and false negative, respectively.

Sensitivity (also known as recall or TPR) is calculated by dividing the number of TP by the total number of actual positives (the sum of TP and FN). This metric is vital in contexts like medical screening where failing to identify a true positive case carries high stakes. The formula is:

Sensitivity = TP/(TP + FN) × 100%.

Specificity reflects a model’s ability to correctly identify negative examples. It is calculated as the fraction of all actual negative instances that are correctly classified as negative. This metric is paramount in situations where the consequences of a false positive, incorrectly classifying a negative case as positive, are particularly high. The formula is:

Specificity = TN/(TN + FP) × 100%.

The F1 score is calculated as the harmonic mean of precision (positive predictive value, PPV) and recall (sensitivity). This calculation method creates a balanced measure of a test’s performance, which is particularly valuable when these two metrics are uneven. The formula is expressed as:

F1 score = 2 × (PPV × sensitivity)/(PPV + sensitivity).
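The four formulas above can be evaluated together from a single confusion matrix. The following Python sketch is a minimal illustration with a hypothetical function name and made-up counts; it returns proportions rather than percentages and does not guard against empty denominators.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the metrics defined above from confusion-matrix counts."""
    precision = tp / (tp + fp)    # positive predictive value (PPV)
    sensitivity = tp / (tp + fn)  # recall, true positive rate
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Made-up counts for illustration (not study data)
m = classification_metrics(tp=40, fp=10, tn=930, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With a rare outcome, as in this example, accuracy stays high even when one third of the positive cases are missed, which is why sensitivity and the F1 score are reported alongside it.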

Dataset division and data balancing

In order to combat overfitting, we randomly divided the sample into training and testing sets in a 7:3 ratio, ensuring the balance of positive and negative samples was preserved. The training set provides the data for the model to learn from, whereas the validation set is used iteratively to optimize hyperparameters and ultimately select the model with superior performance. No oversampling techniques were applied, as preliminary tests with synthetic minority oversampling technique yielded no performance gains, likely due to the categorical nature of the data and the inherent robustness of the classifiers to imbalance[44].
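A stratified 7:3 split of this kind can be sketched as follows. The code is illustrative only and is not the study's implementation: the function name, the seed, and the rounding rule are assumptions, and the labels are reconstructed from the class counts in Table 1 rather than from participant records.

```python
import random


def stratified_split(labels, test_fraction=0.3, seed=42):
    """Return (train_idx, test_idx) with each class split at the same fraction."""
    rng = random.Random(seed)  # the seed value is an arbitrary assumption
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Class counts from Table 1: 7090 without and 519 with depressive symptoms
labels = [0] * 7090 + [1] * 519
train_idx, test_idx = stratified_split(labels)
test_rate = sum(labels[i] for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx), round(test_rate, 3))
```

Splitting within each class keeps the proportion of positive cases in the test set close to the overall prevalence, which a purely random split does not guarantee for a minority class of this size.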

Model validation

For model validation, we used an internal test cohort, separate from the training set, and performed cross-validation to ensure the robustness and generalizability of the models. After applying data normalization, the feature set was refined by removing predictors with very low variance, those with a low correlation to the target label, and those exhibiting strong inter-correlations. This process was undertaken to improve the overall quality of the data for modeling.

In this study, we utilized three ML algorithms: LightGBM, SVM, and XGBoost. To optimize their performance, we employed grid search with cross-validation to tune the hyperparameters for each model. This process involved systematically evaluating various combinations of hyperparameters on the training dataset and selecting the best set based on their average performance across the cross-validation folds. The detailed process of hyperparameter grid optimization, including the specific ranges explored for each model and the selection criteria, is presented in Supplementary material. Using these optimized hyperparameters, we trained the final models on the entire training dataset and evaluated their performance on the independent test set, as presented in the results section.

We utilized a repeated five-fold cross-validation method (5 repeats) on the training data to train and optimize the hyperparameters for the three classification algorithms. Each model’s performance was quantified by its average accuracy across the validation folds, and the algorithm with the superior AUC was ultimately chosen as our final predictive model. In this study, we evaluated model performance using only the test set and presented the ROC, calibration, and DCA results based on this set. While the training set is crucial for model development, the test set is essential for assessing the model’s generalizability and real-world performance. The test set contains data the model has not seen during training, allowing for an unbiased evaluation of its ability to generalize to new and unseen data. Reporting performance on the test set guards against overly optimistic estimates caused by overfitting, so that the reported metrics better reflect the model’s expected performance in real-world scenarios.
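The tuning procedure (grid search scored by repeated five-fold cross-validation) can be sketched in plain Python as below. This is a schematic under stated assumptions, not the study's code: `score_fn` is a caller-supplied placeholder that would fit a model and return a validation score, the hyperparameter names in the example are hypothetical, and the ranges actually explored are those in the Supplementary material.

```python
import itertools
import random


def param_grid(grid):
    """Yield every hyperparameter combination from a dict of candidate lists."""
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def repeated_stratified_kfold(labels, n_splits=5, n_repeats=5, seed=0):
    """Yield (train_idx, val_idx) pairs; every repeat reshuffles within each class."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for cls in sorted(set(labels)):
            members = [i for i, y in enumerate(labels) if y == cls]
            rng.shuffle(members)
            for pos, idx in enumerate(members):
                folds[pos % n_splits].append(idx)  # deal indices round-robin
        for k in range(n_splits):
            val = sorted(folds[k])
            train = sorted(i for j, f in enumerate(folds) if j != k for i in f)
            yield train, val


def grid_search(labels, grid, score_fn):
    """Return the combination with the best mean validation score."""
    splits = list(repeated_stratified_kfold(labels))
    best_params, best_score = None, float("-inf")
    for params in param_grid(grid):
        mean = sum(score_fn(params, tr, va) for tr, va in splits) / len(splits)
        if mean > best_score:
            best_params, best_score = params, mean
    return best_params, best_score


# Hypothetical grid and a dummy scorer standing in for model fitting
toy_labels = [0] * 40 + [1] * 10
toy_grid = {"learning_rate": [0.05, 0.1], "num_leaves": [15, 31]}
dummy = lambda p, tr, va: -abs(p["learning_rate"] - 0.1) - abs(p["num_leaves"] - 31) / 100
print(grid_search(toy_labels, toy_grid, dummy))
```

Five folds repeated five times give 25 validation scores per combination, so the mean used for selection is less sensitive to a single unlucky partition than a single five-fold run.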

Model interpretation and ablation study

Model interpretation is crucial for understanding the factors driving model predictions. We utilized SHapley Additive exPlanations (SHAP) for model interpretation, specifically to determine the importance of each feature in predicting the risk of depressive symptoms. Rooted in cooperative game theory, SHAP equitably assigns a Shapley value to each feature[45], representing its marginal contribution to the ML model’s prediction. This is achieved by averaging its impact across all feature permutations, resulting in clear explanations for the model’s output[46].
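The permutation-averaging definition of the Shapley value can be illustrated by brute force on a tiny example. The following Python sketch is not the SHAP library and not the study's model: the value function `toy_risk` and its numbers are made up, and full enumeration is only feasible for a handful of features because the number of orderings grows factorially.

```python
from itertools import permutations


def shapley_values(features, value_fn):
    """Exact Shapley values: average marginal contribution over all feature orderings.

    value_fn maps a frozenset of included feature names to a model output.
    """
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        included = frozenset()
        for f in order:
            with_f = included | {f}
            totals[f] += value_fn(with_f) - value_fn(included)
            included = with_f
    return {f: t / len(orderings) for f, t in totals.items()}


def toy_risk(subset):
    """Hypothetical risk: additive effects plus one insomnia-anxiety interaction."""
    risk = 0.05
    if "insomnia" in subset:
        risk += 0.20
    if "anxiety" in subset:
        risk += 0.30
    if "insomnia" in subset and "anxiety" in subset:
        risk += 0.10
    if "exercise" in subset:
        risk -= 0.05
    return risk


phi = shapley_values(["insomnia", "anxiety", "exercise"], toy_risk)
print(phi)
```

In this toy example the interaction term is shared equally between the two interacting features, and the values sum to the difference between the full-model output and the baseline, which is the additivity property that makes the attributions interpretable.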

An ablation study was conducted to determine the impact of the five most influential features, as identified by SHAP, on the predictive performance of the optimal model. This systematic approach quantifies the necessity of key predictors by iteratively removing them, enabling clinically interpretable feature prioritization while maintaining predictive accuracy. The selected model, determined by the highest AUC across LightGBM, XGBoost, and SVM candidates, was retrained iteratively on progressively reduced feature subsets. Specifically, each of the top five SHAP-derived features was sequentially excluded from the original feature set, and the model was retrained using identical hyperparameters and preprocessing steps to ensure comparability. To quantify the impact of individual features, the relative decrease in AUC between the full model and each ablated model was computed, with statistical significance assessed via DeLong’s test for paired AUC comparisons. Additionally, a comprehensive ablation was performed by simultaneously excluding all five features to evaluate their collective contribution. All analyses were repeated with 1000 bootstrap resamples to estimate 95% confidence intervals (CIs) for performance metrics, ensuring robustness against sampling variability. This approach not only validated the necessity of the selected features but also provided insights into their clinical interpretability, aligning with the goal of developing a parsimonious, interpretable nomogram for real-world depression risk stratification.
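The bookkeeping of such an ablation study can be sketched as follows. This Python sketch covers only the feature-subset construction, the relative AUC decrease, and a percentile bootstrap interval for AUC; model retraining and DeLong's test are omitted. All function names and feature identifiers are hypothetical spellings of the variables listed in the Measurements section, and resampling test-set predictions is an assumption about how the bootstrap might be applied.

```python
import random


def roc_auc(y_true, scores):
    """AUC as the share of positive-negative pairs ranked correctly; ties count one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ablation_feature_sets(all_features, top_features):
    """Each top feature removed alone, plus one set with all of them removed."""
    sets = {f"without_{f}": [x for x in all_features if x != f] for f in top_features}
    sets["without_top_features"] = [x for x in all_features if x not in top_features]
    return sets


def relative_auc_decrease(auc_full, auc_ablated):
    """Drop in AUC expressed as a fraction of the full model's AUC."""
    return (auc_full - auc_ablated) / auc_full


def bootstrap_auc_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUC, resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # AUC is undefined unless both classes are present
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    return aucs[int(alpha / 2 * n_boot)], aucs[int((1 - alpha / 2) * n_boot) - 1]


features = ["age", "residence", "bmi", "education", "marital_status", "income_class",
            "smoke", "drink", "tea", "exercise", "chronic_disease", "insomnia",
            "anxiety_symptoms", "sleep_latency", "sleep_duration", "sleep_efficiency"]
top5 = ["insomnia", "anxiety_symptoms", "age", "chronic_disease", "exercise"]
subsets = ablation_feature_sets(features, top5)
print({name: len(cols) for name, cols in subsets.items()})
```

Each subset would then be passed to the same training pipeline with fixed hyperparameters, and the resulting AUC compared with the full model through `relative_auc_decrease`.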

Building nomogram

We constructed a nomogram based on the top five important features to enhance prediction accuracy and interpretability. The selection of the top five SHAP-derived features for nomogram construction optimally balanced predictive accuracy, clinical practicality, and statistical robustness, adhering to the Pareto principle where these features captured the majority of predictive power while avoiding overfitting risks inherent in higher-dimensional models. The nomogram enabled the calculation of the total risk score for individual patients and predicted the probability of depressive symptoms. Each feature factor is represented by an independent scale line in the figure. We performed DCA to assess the clinical utility of the predictive model, evaluating the net benefit of using the model at different threshold probabilities[43].
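The net benefit used in DCA follows a standard definition: the true positive rate per person minus the false positive rate per person weighted by the odds of the threshold probability. The following Python sketch shows that calculation for a model and for the treat-all strategy (the treat-none strategy has a net benefit of zero by definition). Function names and data are hypothetical and are not taken from the study.

```python
def net_benefit(y_true, probs, threshold):
    """Net benefit of intervening on everyone with predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 0)
    weight = threshold / (1 - threshold)  # harm of a false positive relative to a true positive
    return tp / n - fp / n * weight


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of intervening on everyone regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Made-up outcomes and predicted risks for illustration (not study data)
y = [1, 1, 0, 0, 0]
p = [0.9, 0.7, 0.6, 0.2, 0.1]
for pt in (0.2, 0.5):
    print(pt, round(net_benefit(y, p, pt), 3), round(net_benefit_treat_all(y, pt), 3))
```

A decision curve is obtained by evaluating both functions over a grid of threshold probabilities; the model is clinically useful at thresholds where its net benefit exceeds both default strategies.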

RESULTS
Baseline characteristics

The study included 7609 participants, with 93.2% (7090) classified as non-depressive and 6.8% (519) exhibiting depressive symptoms (Table 1). The age distribution differed significantly between groups (χ2 = 47.52, P < 0.001): 73.0% of the total cohort were aged 18-39 years, with a higher proportion in the depressive group (85.9%) than in the non-depressive group (72.0%). Geographically, 78.0% resided in urban areas, with minimal difference between groups (79.2% depressive vs 77.9% non-depressive, P = 0.488). The depressive symptoms group had more single/unmarried individuals (47.2% vs 30.8%), higher alcohol consumption (15.2% vs 6.3% consuming ≥ 1 time/week), lower exercise frequency (47.0% vs 29.7% hardly exercising), and higher smoking rates (6.9% vs 1.3%) (all P < 0.001). Clinically, they showed a higher prevalence of underweight BMI (< 18.5 kg/m²; 22.5% vs 17.1%, P = 0.003). They also showed a higher prevalence of anxiety symptoms (44.1% vs 1.7%) and insomnia (74.6% vs 14.7%), as well as poorer sleep, with fewer achieving a sleep efficiency > 85% (65.9% vs 79.1%) and fewer achieving ≥ 7 hours of sleep (22.9% vs 40.8%) (all P < 0.001). Notably, chronic diseases were less prevalent in the depressive symptoms group (47.0% vs 77.6%, P < 0.001). Educational attainment and income class were similar between groups.

Comparison of model performance

The calibration of the various models is visually assessed using the calibration curves presented in Figure 3. The calibration curves revealed distinct model behaviors in aligning with the ideal diagonal. LightGBM exhibited the closest overall fit, though systematic overprediction occurred in mid-range probabilities (0.4-0.7), where the observed frequencies were only 0.4-0.5. XGBoost showed larger deviations at approximately 0.4 and approximately 0.7 probabilities, indicating suboptimal calibration compared to LightGBM. SVM displayed severe miscalibration with erratic fluctuations, notably underpredicting outcomes at approximately 0.5 predicted probability, where the observed fraction reached 1.0, reflecting potential overfitting. All models partially aligned with the diagonal in high-probability ranges (0.8-1.0), with LightGBM converging most smoothly.
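A calibration curve of this kind is typically built by grouping predicted probabilities into bins and comparing the mean prediction in each bin with the observed fraction of positive cases. The following Python sketch shows one such construction with equal-width bins; the function name, the number of bins, and the data are assumptions, since the binning used for the figure is not stated.

```python
def calibration_curve(y_true, probs, n_bins=10):
    """Return (mean predicted probability, observed fraction, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, probs):
        k = min(int(p * n_bins), n_bins - 1)  # a probability of 1.0 goes in the last bin
        bins[k].append((p, y))
    curve = []
    for members in bins:
        if members:
            mean_pred = sum(p for p, _ in members) / len(members)
            observed = sum(y for _, y in members) / len(members)
            curve.append((mean_pred, observed, len(members)))
    return curve


# Made-up outcomes and predicted probabilities for illustration (not study data)
y = [0, 0, 1, 1, 1, 0]
p = [0.05, 0.08, 0.55, 0.58, 0.95, 0.52]
for mean_pred, observed, count in calibration_curve(y, p):
    print(f"predicted {mean_pred:.2f}  observed {observed:.2f}  n = {count}")
```

The count per bin matters when reading such a curve: with a prevalence near 7%, few test cases receive mid-range or high predicted probabilities, so points in those regions rest on small samples and can fluctuate widely.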

Figure 2 The receiver operating characteristic curve of all models. ROC: Receiver operating characteristic; XGBoost: Extreme gradient boosting; SVM: Support vector machine; LightGBM: Light gradient boosting machine; AUC: Area under the curve.

The performance of the three ML models, XGBoost, SVM, and LightGBM, was evaluated using the test set, as presented in the ROC curves (Figure 2). The models’ ability to distinguish between the two classes was assessed using the AUC. The testing set ROC curves presented in the right subplot demonstrate robust discriminatory performance across the three ML models. LightGBM (blue line) achieved the highest AUC of 0.867, with its trajectory closely paralleling that of XGBoost (green line, AUC = 0.862), indicating comparable classification efficacy between these two ensemble methods. The SVM classifier (orange line) exhibited marginally inferior performance with an AUC of 0.849. The tight clustering of the LightGBM and XGBoost trajectories (AUC difference = 0.005) suggests near-equivalent generalization capability. The ROC curves of all three models rose steeply, reaching a high TPR at low FPR. The diagonal dashed line represents the performance of a random classifier, which serves as a baseline for comparison. All three models clearly outperformed the random classifier, as evidenced by the curves lying well above the diagonal. These results suggested that LightGBM demonstrated the best discrimination among the three algorithms, followed by XGBoost and SVM.

Figure 3 Calibration curve of all models. XGBoost: Extreme gradient boosting; SVM: Support vector machine; LightGBM: Light gradient boosting machine.

Table 2 summarizes the performance of the three ML models on the test set in terms of accuracy, precision, recall, F1 score, and AUC. The models differed only modestly across these metrics. XGBoost achieved an accuracy of 0.941, slightly lower than SVM (0.946) and LightGBM (0.944); these marginal differences suggest comparable overall classification reliability. SVM achieved the highest precision (0.707), outperforming LightGBM (0.670) and XGBoost (0.603), which may reflect its margin-maximizing objective reducing false positives. However, XGBoost exhibited the highest recall (0.429), compared with LightGBM (0.378) and SVM (0.371), indicating a relatively stronger ability to identify true positive cases under class imbalance.

Table 2 The performance metrics of three machine learning models

Model      Accuracy  Precision  Recall  F1 score  AUC
XGBoost    0.941     0.603      0.429   0.501     0.862
SVM        0.946     0.707      0.371   0.487     0.849
LightGBM   0.944     0.670      0.378   0.483     0.867

The F1 score, the harmonic mean of precision and recall, further highlighted XGBoost’s advantage (0.501) over SVM (0.487) and LightGBM (0.483), suggesting a better balance between false positives and false negatives for this task. In terms of discriminative power measured by AUC, LightGBM marginally outperformed the others (0.867), followed by XGBoost (0.862) and SVM (0.849), which may reflect LightGBM’s leaf-wise tree-growing strategy and histogram-based splitting in capturing complex feature interactions. These observations are consistent with reports that XGBoost is robust on smaller datasets and that LightGBM handles heterogeneous features efficiently. While SVM excelled in precision, the ensemble methods (XGBoost and LightGBM) showed a more balanced overall performance, particularly where sensitivity and specificity must be traded off. Because the LightGBM model achieved the highest AUC, indicating the strongest ability to discriminate between women with and without depressive symptoms, and because accurate risk prediction is crucial for effective intervention, we selected LightGBM as the optimal model for the subsequent SHAP-based feature importance analysis.
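The F1 scores in Table 2 follow directly from the reported precision and recall, since F1 = 2 × precision × recall / (precision + recall). The short sketch below recomputes them as a consistency check; the helper name `f1_from_precision_recall` is hypothetical, and the inputs are the rounded values printed in Table 2.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in Table 2.
table2 = {
    "XGBoost": (0.603, 0.429),
    "SVM": (0.707, 0.371),
    "LightGBM": (0.670, 0.378),
}
for model, (precision, recall) in table2.items():
    print(f"{model}: F1 = {f1_from_precision_recall(precision, recall):.3f}")
```

The recomputed values (0.501, 0.487, and 0.483) agree with the F1 column of Table 2.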

Feature importance and model interpretation

Feature importance was assessed using SHAP values (Figure 4), which revealed marked differences in the predictive contributions of the features to depressive symptoms. The figure shows the mean absolute SHAP value of each feature; blue bars represent contributions for individuals without depressive symptoms, and purple bars represent contributions for those with depressive symptoms. The horizontal axis quantifies the magnitude of feature impact, while the vertical axis ranks features by overall importance. Insomnia was the most important feature, with markedly higher SHAP values in individuals with depressive symptoms than in those without, highlighting insomnia as a strong predictor. Anxiety symptoms were also strongly associated with depressive symptoms, with markedly higher SHAP values among individuals with depressive symptoms. For age, the mean absolute SHAP value was larger in the non-depressive group (blue bar) than in the depressive group (purple bar). Chronic diseases showed balanced contributions across groups. Exercise also emerged as a key predictor, with non-depressive individuals displaying higher mean absolute SHAP values, suggesting a stronger protective effect of habitual exercise in the non-clinical population.

Figure 4 Feature contributions by class using SHapley Additive exPlanations values. LightGBM: Light gradient boosting machine; SHAP: SHapley Additive exPlanations.

Based on the SHAP values, we identified the five variables with the greatest impact on the outcome: insomnia, anxiety, age, chronic disease, and exercise. To evaluate the contribution of each feature to model performance, we conducted an ablation study comparing the complete model (all features) with models from which one feature at a time was removed (Figure 5). The complete model achieved an AUC of 0.875. Removing age increased the AUC to 0.884 (ΔAUC = +0.009) and removing exercise increased it to 0.881 (ΔAUC = +0.006), suggesting that these features may be redundant or introduce noise. Removing chronic disease decreased the AUC to 0.869 (ΔAUC = -0.006), indicating a marginal positive contribution. Notably, removing insomnia reduced the AUC markedly to 0.804 (ΔAUC = -0.071), and removing anxiety reduced it to 0.850 (ΔAUC = -0.025), establishing both as critical predictors; the drop for insomnia was approximately 2.8-fold that for anxiety. Feature importance was therefore distributed highly unevenly across the five variables.
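The ΔAUC values and the 2.8-fold ratio can be reproduced from the reported AUCs with simple arithmetic. The sketch below does this in plain Python; the variable and function names are hypothetical, and only the AUC values come from the text.

```python
# AUC of the complete model and of each single-feature ablation, as reported in the text.
baseline_auc = 0.875
ablated_auc = {
    "insomnia": 0.804,
    "anxiety": 0.850,
    "age": 0.884,
    "chronic disease": 0.869,
    "exercise": 0.881,
}


def auc_deltas(baseline, ablated):
    """Change in AUC after removing each feature (negative means the feature helped)."""
    return {feature: round(auc - baseline, 3) for feature, auc in ablated.items()}


deltas = auc_deltas(baseline_auc, ablated_auc)
for feature, delta in sorted(deltas.items(), key=lambda item: item[1]):
    print(f"{feature}: dAUC = {delta:+.3f}")

# Relative size of the two largest drops.
ratio = deltas["insomnia"] / deltas["anxiety"]
print(f"insomnia / anxiety drop ratio = {ratio:.1f}")
```

Sorting by ΔAUC places insomnia first and age last, which mirrors the ordering described in the paragraph above.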

Figure 5 Ablation study on key features using the receiver operating characteristic curve. LightGBM: Light gradient boosting machine; ROC: Receiver operating characteristic; AUC: Area under the curve.

A nomogram was established to predict the risk of depressive symptoms in women (Figure 6). It assigns points to each predictive variable and sums them into a total score; a higher total score indicates a greater likelihood of depressive symptoms. Each individual can thus be scored according to these established risk factors. We have also developed an online digital version of this nomogram, which is accessible via the Supplementary material.

Figure 6 Nomogram for predicting the risk of depression based on insomnia, anxiety symptoms, chronic disease, exercise, and age. The total points are used to calculate the linear predictor and risk probability.
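A nomogram of this type is a graphical form of a logistic regression model: the points are rescaled regression coefficients, their sum corresponds to the linear predictor, and the risk probability is the logistic transform of that predictor. The sketch below illustrates the calculation only. The intercept, the coefficients, and the binary coding of the five predictors are hypothetical placeholders, not the fitted values of the published nomogram.

```python
import math

# Hypothetical intercept and log-odds coefficients for binary-coded predictors.
# These are illustrative placeholders, NOT the values fitted in the study.
INTERCEPT = -4.0
COEFS = {
    "insomnia": 2.0,
    "anxiety": 1.5,
    "chronic_disease": 0.4,
    "regular_exercise": -0.5,
    "age_18_39": 0.6,
}


def linear_predictor(person):
    """Intercept plus the sum of coefficient * value; missing predictors count as 0."""
    return INTERCEPT + sum(COEFS[name] * person.get(name, 0) for name in COEFS)


def predicted_risk(person):
    """Logistic transform of the linear predictor, giving a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(person)))


example = {"insomnia": 1, "anxiety": 1, "regular_exercise": 1}
print(f"linear predictor = {linear_predictor(example):.2f}")
print(f"predicted risk   = {predicted_risk(example):.3f}")
```

With real coefficients, reading the total points off the nomogram and evaluating this function would give the same probability, up to the reading precision of the chart.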

ROC curves for the training and internal test cohorts are shown in Figure 7. The X-axis represents the FPR (1-specificity) and the Y-axis represents the TPR (sensitivity). The nomogram showed excellent discrimination, with an AUC of 0.910 (95%CI: 0.885-0.935) in the internal test cohort, closely matching the 0.906 (95%CI: 0.889-0.922) obtained in the training cohort. The curves of both cohorts followed near-identical trajectories close to the upper-left corner and lay above the random-chance diagonal across the full FPR range (0-1.00). The overlapping CIs and parallel curves indicated minimal performance divergence between cohorts, suggesting little overfitting. In both cohorts, operating points were available at which sensitivity exceeded 0.75 while the FPR remained below 0.25. These findings demonstrate consistent discrimination in the held-out internal test data and support the potential for clinical translation, pending validation in independent samples.

Figure 7 The receiver operating characteristic curves for the predictive model in the training and internal test cohorts. AUC: Area under the curve.
Clinical utility and DCA

Clinical utility was evaluated using DCA, which assesses the net benefit of the model across probability thresholds for predicting depressive symptoms in women; calibration was examined first. The calibration plot for the logistic regression model (Figure 8) showed moderate miscalibration, with systematic overprediction across the low-to-moderate risk range. Between 25% and 75% predicted risk (X-axis), the observed risk curve lay below the ideal calibration diagonal, that is, observed risk was lower than predicted risk, and exhibited pronounced local fluctuations. This deviation peaked in the mid-range probabilities, indicating overestimated risk for intermediate-risk subgroups. Alignment improved in the higher risk strata (75%-100% predicted risk), where the curve closely followed the diagonal despite a transient downward deflection near 85% predicted risk. The model approached near-perfect calibration in the uppermost range (90%-100% predicted risk), with only minimal residual fluctuations around the diagonal. Discrimination was preserved, with an AUC of 91.0% (95%CI: 88.5%-93.5%), and the Brier score was 4.3% (95%CI: 3.7%-5.0%), indicating strong predictive accuracy and an acceptable overall prediction error. Together, these patterns indicate that calibration varied with risk level and was best in the high-risk range, where agreement between predicted and observed risk is most critical for intervention decisions.

Figure 8 Calibration plot for the logistic regression model. The plot shows the relationship between predicted and observed risks. AUC: Area under the curve.
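The Brier score is the mean squared difference between predicted probabilities and observed binary outcomes, so lower values are better; a score reported as 4.3% corresponds to 0.043 on the 0-1 scale. The sketch below shows the calculation together with a simple reference value, namely the score of a model that always predicts the event prevalence. The function names and toy data are hypothetical, and the 6.8% prevalence is the figure given for the overall sample, which may differ from the prevalence in the cohort used for Figure 8.

```python
def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def reference_brier(prevalence):
    """Brier score of a constant prediction equal to the prevalence: pi * (1 - pi)."""
    return prevalence * (1 - prevalence)


# Toy predictions (hypothetical, not study data).
print(brier_score([1, 0, 0, 1], [0.9, 0.2, 0.1, 0.6]))
# Reference for a sample with 6.8% prevalence (assumption, see lead-in).
print(round(reference_brier(0.068), 4))
```

A fitted model is informative to the extent that its Brier score falls below this prevalence-only reference.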

The DCA (Figure 9) indicated that the predictive model provided greater net benefit than blanket strategies across most risk thresholds. From a threshold of 0.3 onward, the model (red line) showed a sustained net benefit advantage over the “treat all” approach (grey line); its net benefit declined gradually until a threshold of 0.9 and then fell steeply, while remaining above the “treat all” line up to the extreme thresholds (0.95-1.0). In contrast, the net benefit of the “treat all” strategy deteriorated rapidly beyond 0.7 and crossed zero at 0.95, whereas the model’s net benefit never fell below zero. The model’s peak clinical utility occurred at thresholds of 0.4-0.6, corresponding to cost:benefit ratios of 2:3 to 3:2, where it delivered its maximum net benefit (0.6-0.8) for individualized intervention decisions. The model outperformed both “treat none” (black line, zero net benefit) and “treat all” over the threshold range 0.3-1.0, that is, about 70% of the threshold spectrum, and was particularly advantageous in moderate-to-high risk scenarios in which clinical risk tolerance lies between 0.4 and 0.8. The preserved advantage at extreme thresholds (> 0.9) is consistent with the good calibration of high-risk predictions, while the advantage at lower thresholds (0.3-0.4) suggests utility in preventive care settings. Overall, this profile supports the model’s adaptability to diverse clinical risk preferences while avoiding the overtreatment inherent in universal strategies.

Figure 9 Decision curve analysis for the predictive model. The plot shows the net benefit of the model across varying thresholds of high risk, compared to strategies of treating all patients or treating none.
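In decision curve analysis, the net benefit at a threshold probability pt is TP/n - FP/n × pt/(1 - pt), where pt/(1 - pt) is the odds that express the cost:benefit ratio (for example, 2:3 at pt = 0.4 and 3:2 at pt = 0.6). The sketch below implements this standard definition for a model and for the “treat all” strategy; the function names and toy data are hypothetical and are not taken from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(prevalence, threshold):
    """Net benefit when everyone is treated: all cases are TP, all non-cases are FP."""
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data (hypothetical): two cases and two non-cases.
toy_y = [1, 1, 0, 0]
toy_p = [0.9, 0.6, 0.4, 0.1]
for pt in (0.3, 0.5, 0.7):
    model_nb = net_benefit(toy_y, toy_p, pt)
    all_nb = net_benefit_treat_all(0.5, pt)
    print(f"pt={pt:.1f} model={model_nb:.3f} treat_all={all_nb:.3f} treat_none=0.000")
```

Under this definition the “treat none” strategy always has a net benefit of zero, and the “treat all” curve reaches zero when the threshold equals the event prevalence in the evaluated sample.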
Key findings

Population characteristics: The depressive group showed higher proportions of younger adults (18-39 years) and unmarried individuals, more unhealthy behaviors (smoking, alcohol use), and poorer sleep quality. Paradoxically, chronic disease prevalence was lower in this group, potentially because of its younger age profile.

Model performance: LightGBM achieved the highest AUC (0.867) and the best calibration among the three models, although XGBoost had the highest recall and F1 score and SVM the highest precision. LightGBM’s mid-range probability predictions aligned more closely with observed outcomes than those of the other models.

Predictive determinants: Insomnia and anxiety emerged as the dominant predictors, with SHAP analysis revealing insomnia’s disproportionate influence compared with other factors. Chronic diseases contributed minimally, suggesting that sample selection bias may obscure their role.

Clinical utility: The model achieved robust discrimination in high-risk stratification and outperformed blanket intervention strategies in DCA. Its good calibration in the highest risk range supports its use in clinical decision-making.

Limitations: Age-related confounding in chronic disease patterns and miscalibration in the mid-risk range highlight areas for refinement. While SHAP clarified feature importance, causal pathways remain unverified.

DISCUSSION

Recent research on ML approaches for female depression prediction has demonstrated evolving multidisciplinary frameworks. Studies have integrated electronic health records with psychological assessments to develop composite risk models, enhancing identification of high-risk populations[47]. Biomarker discovery through proteomic analysis has revealed potential blood-based indicators linked to postpartum depression pathophysiology, with ML validating their predictive relevance[48]. Multicenter investigations employing various algorithms emphasized the effectiveness of combining biological, psychological, and social predictors, particularly highlighting postnatal factors’ critical role in model optimization[15]. Cross-cultural examinations revealed how family dynamics and regional caregiving patterns influence depression risk stratification in female populations, with ensemble learning methods showing particular efficacy in culturally contextualized prediction[14]. These developments collectively advance personalized intervention strategies while underscoring the necessity of incorporating sociocultural dimensions into computational prediction frameworks for women’s mental health. The SHAP-driven selection of top 5 features coupled with nomogram visualization aligns with emerging trends in translational bioinformatics, where model transparency enhances clinical utility while maintaining competitive performance comparable to complex ensembles[49,50]. Despite existing research employing ML methods for depression prediction, studies specifically targeting the female population remain scarce. Developing models particularly for females addresses the unique impacts of depression on women and provides clinicians with more personalized and precise diagnostic and therapeutic support.

In this study, the prevalence of moderate or more severe depressive symptoms was 6.8%, close to the results of previous studies[51,52]. The lower prevalence of chronic diseases observed in the depressive group may reflect age-related selection bias, as the cohort predominantly comprised younger individuals (85.9% aged 18-39) who were less likely to have developed age-dependent conditions such as cardiovascular or metabolic disorders[53]. This paradox may also stem from the cross-sectional design, which captured younger women with depressive symptoms before they had accumulated comorbidities associated with aging.

Our study systematically compared the performance of XGBoost, SVM, and LightGBM, addressing the lack of parallel comparisons of these models. To support generalizability, we used GSPHS data, which include a diverse population of women across various ages, health statuses, and social backgrounds. Data preprocessing, including feature selection and data balancing, further enhanced model performance.

A large body of literature exists on the application of ML models for predicting depression, yet many of these studies focus on evaluating a single model[54,55]. To predict depressive symptoms in women, we compared three mainstream ML methods: XGBoost, SVM, and LightGBM. We selected the optimal model based on the AUC, which is widely adopted in medical and psychological classification tasks because of its robustness to class imbalance[56]. The selected model performed best at ranking positive instances above negative ones, which is critical for clinical applications[57].

The outstanding performance of LightGBM, as evidenced by its higher AUC compared to XGBoost and SVM, could be attributed to its efficient handling of high-dimensional categorical features, adaptability to sparse data patterns, and unique algorithmic design for capturing complex feature interactions. Our dataset consisted exclusively of categorical variables spanning demographic characteristics, behavioral factors, and clinical variables. A key advantage lay in LightGBM’s native support for categorical features without one-hot encoding, avoiding sparse matrix issues. For example, variables like “exercise frequency” were processed via optimal splitting, while SVM required binary encoding that inflated dimensionality. LightGBM further optimized categorical splits using specialized parameters, enhancing performance. The prediction of depressive symptoms often involves complex interactions among multiple variables[58,59]. For example, marital status and income level may jointly influence depression risk[60], while sleep quality and anxiety levels may exhibit interaction effects[61]. LightGBM’s leaf-wise tree growth and hierarchical partitioning effectively captured variable interactions, a characteristic that explains its superior performance. In medical and psychological data, features often exhibit non-linear relationships with the target variable. For example, BMI may show a U-shaped relationship with depressive symptoms, where both underweight and overweight individuals are at higher risk[62]. LightGBM demonstrates enhanced capability in identifying complex nonlinear patterns through its gradient-boosting framework and tree-based architecture[63]. This algorithmic advantage typically yields superior performance in AUC metrics compared to linear models or suboptimally configured SVM implementations. This characteristic likely explains LightGBM’s outperformance in our dataset, particularly when analyzing data containing such intricate pattern structures. 
Because the prevalence of depressive symptoms was low (< 10%), the AUC was an appropriate metric for assessing performance on imbalanced data[64]. LightGBM addressed class imbalance via parameters such as scale_pos_weight and leveraged histogram-based learning for computational efficiency. In contrast, SVM was hampered by the high dimensionality introduced by one-hot encoding. These findings underscore LightGBM’s suitability for medical data with complex patterns. Future studies should use feature importance analysis to elucidate critical interactions and enhance interpretability.
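A common heuristic for LightGBM’s scale_pos_weight parameter is the ratio of negative to positive cases in the training data. The sketch below derives that ratio from a prevalence; the helper name is hypothetical, and the 6.8% figure is the overall prevalence reported in this study. The value actually used by the authors is not reported, and it would differ if the training data had first been rebalanced.

```python
def scale_pos_weight_from_prevalence(prevalence):
    """Negative-to-positive class ratio, a common starting value for scale_pos_weight."""
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    return (1 - prevalence) / prevalence


# With 6.8% positives, each positive case would be weighted roughly 13.7 times.
weight = scale_pos_weight_from_prevalence(0.068)
print(f"scale_pos_weight ~ {weight:.1f}")

# Hypothetical parameter dictionary of the kind passed to a LightGBM binary classifier.
params = {"objective": "binary", "scale_pos_weight": round(weight, 1)}
print(params)
```

Up-weighting the positive class in this way typically raises recall at the cost of precision, so the weight is usually tuned on validation data rather than fixed at the heuristic value.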

We applied SHAP to identify the top five predictive variables, improving the clinical interpretability of these “black-box” algorithms. To better understand the contribution of each factor and refine the model accordingly, we incorporated the top five predictive variables into the ablation study. The ablation study results revealed deeper insights into the relative importance of the top five SHAP-ranked features in predicting depressive symptoms. The removal of the insomnia feature led to a significant decline in AUC, underscoring its critical role in predicting depressive symptoms. Similarly, excluding the anxiety feature resulted in reduced AUC, confirming its importance as a predictive factor. In contrast, eliminating the chronic disease feature caused only a marginal AUC decrease, suggesting its relatively minor contribution to model performance. This may reflect either an indirect relationship between chronic diseases and depression or partial mediation of its effects through other features. Interestingly, removing age and physical activity features resulted in slight AUC improvements, a counterintuitive finding. Although SHAP values identified these features as important, their exclusion may indicate that they introduce noise or overfitting in the model. The improved performance without these features could imply better generalization or potential redundancy, where their predictive effects are already captured by other variables. This phenomenon warrants further investigation to explore whether age and physical activity contribute redundant signals or noisy patterns in depression prediction.

These variables, insomnia, anxiety symptoms, chronic disease, exercise, and age, were used to construct a nomogram. The nomogram demonstrated excellent discrimination, calibration, and clinical utility, providing clinicians with a quantifiable tool for depression risk stratification. DCA confirmed its superior net benefit compared to “treat-all” and “treat-none” strategies. The identified predictors align with established roles in depression pathogenesis and offer practical insights for risk assessment. Our model’s strong performance highlights its potential as a clinical screening instrument, supporting personalized risk assessment and early intervention. This data-driven approach could enhance screening protocols, improve evidence-based decision-making, and boost patient outcomes via targeted prevention strategies.

Insomnia emerged as the top variable influencing the model’s predictive performance. Poor sleep quality or sleep disorders are commonly observed in individuals with depression, often co-occurring with a higher prevalence of chronic illnesses, indicating worse overall health in this population. The relationship between sleep and depression is complex and bidirectional. Insomnia and sleep disturbances exacerbate depressive symptoms by increasing stress, impairing emotional regulation, and disrupting neurobiological pathways, involving the hypothalamic-pituitary-adrenal (HPA) axis. Depression itself can lead to sleep dysfunction, creating a vicious cycle that perpetuates both conditions. Improved sleep quality, including more days per week of restful sleep and reduced sleep dysfunction, is associated with decreased depression[65-68]. Accordingly, it is critical to address sleep disturbances as a key component in managing depression, specifically in women. Women are more likely to experience insomnia due to hormonal fluctuations, caregiving responsibilities, and higher stress levels; their vulnerability to depression may be further heightened by poor sleep.

Anxiety symptoms were the second most important variable, consistent with previous literature indicating that anxiety is crucial in the onset and progression of depressive symptoms[69]. Our study found that individuals with anxiety had a higher risk of experiencing depressive symptoms, supporting the frequent comorbidity and potential shared mechanisms of the two conditions; anxiety disorders are also the most prevalent lifetime mental disorders in China[70]. Major depressive disorder is associated with anxiety disorders, particularly panic disorder, generalized anxiety disorder, and post-traumatic stress disorder[71,72], with emerging evidence highlighting shared neurobiological mechanisms. Both disorders involve dysregulation of the serotonin (5-HT) and norepinephrine systems, as well as HPA axis hyperactivity[73-75]. Reduced 5-HT function in the prefrontal cortex and limbic system is linked to heightened anxiety, emotional reactivity, and impaired stress coping[74,76], while elevated norepinephrine amplifies arousal and vigilance[73]. Chronic stress drives HPA axis hyperactivity, elevating cortisol levels and causing hippocampal atrophy, prefrontal cortex dysfunction, and amygdala hyperactivity, which are key contributors to emotional dysregulation and symptom persistence[77-79]. Cortisol further disrupts the 5-HT/norepinephrine balance, exacerbating the pathophysiology[80]. The bidirectional relationship between anxiety and depression underscores their interdependence, necessitating integrated prevention and treatment[72]. Sexual dimorphism in stress responses (e.g., women’s “tend and befriend” pattern) and the regulatory effects of sex hormones on HPA axis activity may heighten women’s vulnerability to anxiety[73], emphasizing the need for sex-specific approaches in research and clinical practice.

Age emerged as a significant predictor of depressive symptoms, with the depression group exhibiting a younger mean age in this study. This finding may reflect the influence of modern social stressors that disproportionately affect younger individuals, including academic or career pressures, financial instability, and social isolation[81]. Additionally, life stage transitions, such as entering the workforce, building relationships, or starting families, may heighten vulnerability to depression during early adulthood. Hormonal fluctuations, particularly in women, may further contribute to this risk[82]. Younger women also face unique stressors, such as societal expectations and the need to balance multiple roles. These findings underscore the need for targeted interventions and further research into age-related social, biological, and psychological factors to guide mental health strategies.

Chronic disease was a key predictor of depressive symptoms in our model, reflecting the close association between physical and mental health. Chronic activation of the HPA axis, commonly observed in individuals with chronic diseases, is linked to the development and progression of major depressive disorder and various physical conditions, such as cardiovascular disease[73,79]. The persistent physiological stress associated with chronic illness, including inflammation, hormonal dysregulation, and immune dysfunction, may induce or exacerbate depressive symptoms. Addressing chronic disease as part of depression management is therefore critical. Future research should identify the most effective strategies to break the cycle of chronic disease and depression to develop more personalized and holistic treatment approaches.

In this study, we found that women who exercised four or more times a month were less likely to experience depressive symptoms. Exercise stimulates dopamine and endorphin production, elevates mood, and improves depression and anxiety across diverse populations, including those with chronic diseases[83-85]. Exercise also reduces inflammation, improves sleep, and enhances cognitive function, supporting better mental health[86]. These effects are particularly relevant for depression, where biological, psychological, and social factors interact to worsen symptoms. Given its multifaceted benefits, incorporating exercise into depression treatment strategies is essential. Future research should explore the optimal exercise types, intensities, and frequencies to maximize its therapeutic potential.

Nomograms are predictive tools based on statistical models that integrate multiple factors to estimate the probability of a specific clinical event[42]. In this study, a nomogram was constructed from the SHAP-derived predictors to estimate the probability of depressive symptoms in women. The tool demonstrated robust predictive performance and substantial positive net benefit, and it offers healthcare providers a practical way to prioritize screening. Providers could use the nomogram to stratify women by risk level, directing screening and follow-up resources toward those with elevated scores to optimize care delivery and improve outcomes. By incorporating these five variables into a nomogram, providers may efficiently identify high-risk individuals during routine visits, particularly in resource-limited settings where comprehensive mental health evaluations are challenging.

This study has several limitations. The sample consisted exclusively of women, which, while valuable for understanding sex-specific aspects of depression, limits generalizability to other populations. Because of geographic sampling bias and limited population diversity, the model requires validation across broader socioeconomic and regional contexts. The reliance on self-reported data for smoking, drinking, and anxiety symptoms may introduce bias, particularly regarding sensitive topics; more objective data collection methods, such as wearable devices, could help mitigate this limitation. This cross-sectional study did not specifically analyze depression during pregnancy and perimenopause, potentially reducing model accuracy for these high-risk groups. While the LightGBM model exhibited strong predictive performance, potential overfitting necessitates further validation in independent samples. The complex interactions among sleep quality, marital status, and anxiety symptoms in the onset of depression require longitudinal cohort studies to clarify causal relationships. Future research should employ more detailed variable categorization and broader population sampling to enhance generalizability.

CONCLUSION

This study developed a robust predictive model for the risk of depressive symptoms in women using ML algorithms, including XGBoost, SVM, and LightGBM. The model demonstrated excellent discriminative ability and acceptable calibration, indicating its potential for application in clinical practice. Insomnia, anxiety symptoms, age, chronic disease, and exercise were identified as key predictors of depressive symptoms in women. These findings suggest that clinical and public health strategies should target these specific factors to enable the early detection of at-risk individuals and the delivery of personalized interventions for women’s mental health. The integration of SHAP enhanced model interpretability, and the interactive digital version of the nomogram provides a practical tool for clinical application. Future research will address the current limitations to further refine these predictive models.

ACKNOWLEDGEMENTS

We sincerely thank all participants and researchers of the Guangdong Provincial Sleep and Psychosomatic Health Survey for their contributions, and we would like to acknowledge everyone who helped us in this study.

Footnotes

Provenance and peer review: Unsolicited article; Externally peer reviewed.

Peer-review model: Single blind

Specialty type: Psychiatry

Country of origin: China

Peer-review report’s classification

Scientific Quality: Grade C, Grade C, Grade D

Novelty: Grade B, Grade C, Grade D

Creativity or Innovation: Grade B, Grade C, Grade C

Scientific Significance: Grade B, Grade C, Grade D

P-Reviewer: Tasci B; Xian XB; S-Editor: Wu S; L-Editor: A; P-Editor: Wang WB

References
1. GBD 2019 Mental Disorders Collaborators. Global, regional, and national burden of 12 mental disorders in 204 countries and territories, 1990-2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet Psychiatry. 2022;9:137-150.
2. Balakrishnan V, Ng KS, Kaur W, Govaichelvan K, Lee ZL. COVID-19 depression and its risk factors in Asia Pacific - A systematic review and meta-analysis. J Affect Disord. 2022;298:47-56.
3. Hidayati I, Tan W, Yamu C. How gender differences and perceptions of safety shape urban mobility in Southeast Asia. Transp Res Part F Traffic Psychol Behav. 2020;73:155-173.
4. Glenister KM, Ervin K, Podubinski T. Detrimental Health Behaviour Changes among Females Living in Rural Areas during the COVID-19 Pandemic. Int J Environ Res Public Health. 2021;18:722.
5. Albert PR. Why is depression more prevalent in women? J Psychiatry Neurosci. 2015;40:219-221.
6. Malhi GS, Mann JJ. Depression. Lancet. 2018;392:2299-2312.
7. Correll CU, Solmi M, Veronese N, Bortolato B, Rosson S, Santonastaso P, Thapa-Chhetri N, Fornaro M, Gallicchio D, Collantoni E, Pigato G, Favaro A, Monaco F, Kohler C, Vancampfort D, Ward PB, Gaughran F, Carvalho AF, Stubbs B. Prevalence, incidence and mortality from cardiovascular disease in patients with pooled and specific severe mental illness: a large-scale meta-analysis of 3,211,768 patients and 113,383,368 controls. World Psychiatry. 2017;16:163-180.
8. Patwardhan V, Gil GF, Arrieta A, Cagney J, DeGraw E, Herbert ME, Khalil M, Mullany EC, O'Connell EM, Spencer CN, Stein C, Valikhanova A, Gakidou E, Flor LS. Differences across the lifespan between females and males in the top 20 causes of disease burden globally: a systematic analysis of the Global Burden of Disease Study 2021. Lancet Public Health. 2024;9:e282-e294.
9. Lima S, Sousa N, Patrício P, Pinto L. The underestimated sex: A review on female animal models of depression. Neurosci Biobehav Rev. 2022;133:104498.
10. Harpham T. Urbanization and mental health in developing countries: a research role for social scientists, public health professionals and social psychiatrists. Soc Sci Med. 1994;39:233-245.
11. Ochnik D, Buława B, Nagel P, Gachowski M, Budziński M. Urbanization, loneliness and mental health model - A cross-sectional network analysis with a representative sample. Sci Rep. 2024;14:24974.
12. He H, Wang X, Yan CY, Jiao J. [Demographic factors and medianisms affecting the mental health of Chinese residents and mechanisms: Empirical study based on provincial panel data]. Zhongguo Weisheng Zhengce Yanjiu. 2024;17:56-63.
13. Peng Y, Wu T, Chen Z, Deng Z. Value Cocreation in Health Care: Systematic Review. J Med Internet Res. 2022;24:e33061.
14. Niu B, Wan M, Zhou Y. Development of an explainable machine learning model for predicting depression in adolescent girls with non-suicidal self-injury: A cross-sectional multicenter study. J Affect Disord. 2025;379:690-702.
15. Qi W, Wang Y, Wang Y, Huang S, Li C, Jin H, Zuo J, Cui X, Wei Z, Guo Q, Hu J. Prediction of postpartum depression in women: development and validation of multiple machine learning models. J Transl Med. 2025;23:291.
16. Su D, Zhang X, He K, Chen Y. Use of machine learning approach to predict depression in the elderly in China: A longitudinal study. J Affect Disord. 2021;282:289-298.
17. Luo L, Yuan J, Wu C, Wang Y, Zhu R, Xu H, Zhang L, Zhang Z. Predictors of depression among Chinese college students: a machine learning approach. BMC Public Health. 2025;25:470.
18. Nickson D, Meyer C, Walasek L, Toro C. Prediction and diagnosis of depression using machine learning with electronic health records data: a systematic review. BMC Med Inform Decis Mak. 2023;23:271.
19. Xu WQ, Tan WY, Li XL, Huang ZH, Zheng HR, Hou CL, Jia FJ, Wang SB. Prevalence and correlates of depressive and anxiety symptoms among adults in Guangdong Province of China: A population-based study. J Affect Disord. 2022;308:535-544.
20. Kroenke K, Spitzer RL, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med. 2001;16:606-613.
21. Accuracy of Patient Health Questionnaire-9 (PHQ-9) for screening to detect major depression: individual participant data meta-analysis. BMJ. 2019;365:l1781.
22. Negeri ZF, Levis B, Sun Y, He C, Krishnan A, Wu Y, Bhandari PM, Neupane D, Brehaut E, Benedetti A, Thombs BD; Depression Screening Data (DEPRESSD) PHQ Group. Accuracy of the Patient Health Questionnaire-9 for screening to detect major depression: updated systematic review and individual participant data meta-analysis. BMJ. 2021;375:n2183.
23. Wang W, Bian Q, Zhao Y, Li X, Wang W, Du J, Zhang G, Zhou Q, Zhao M. Reliability and validity of the Chinese version of the Patient Health Questionnaire (PHQ-9) in the general population. Gen Hosp Psychiatry. 2014;36:539-544.
24. Chen S, Chiu H, Xu B, Ma Y, Jin T, Wu M, Conwell Y. Reliability and validity of the PHQ-9 for screening late-life depression in Chinese primary care. Int J Geriatr Psychiatry. 2010;25:1127-1133.
25. Han C, Jo SA, Kwak JH, Pae CU, Steffens D, Jo I, Park MH. Validation of the Patient Health Questionnaire-9 Korean version in the elderly population: the Ansan Geriatric study. Compr Psychiatry. 2008;49:218-223.
26. van Ballegooijen W, Ruwaard J, Karyotaki E, Ebert DD, Smit JH, Riper H. Reactivity to smartphone-based ecological momentary assessment of depressive symptoms (MoodMonitor): protocol of a randomised controlled trial. BMC Psychiatry. 2016;16:359.
27. Byrd-Bredbenner C, Eck K, Quick V. GAD-7, GAD-2, and GAD-mini: Psychometric properties and norms of university students in the United States. Gen Hosp Psychiatry. 2021;69:61-66.
28. Kroenke K, Spitzer RL, Williams JB, Monahan PO, Löwe B. Anxiety disorders in primary care: prevalence, impairment, comorbidity, and detection. Ann Intern Med. 2007;146:317-325.
29. Nath S, Ryan EG, Trevillion K, Bick D, Demilew J, Milgrom J, Pickles A, Howard LM. Prevalence and identification of anxiety disorders in pregnancy: the diagnostic accuracy of the two-item Generalised Anxiety Disorder scale (GAD-2). BMJ Open. 2018;8:e023766.
30. Luo Z, Li Y, Hou Y, Zhang H, Liu X, Qian X, Jiang J, Wang Y, Liu X, Dong X, Qiao D, Wang F, Wang C. Adaptation of the two-item generalized anxiety disorder scale (GAD-2) to Chinese rural population: A validation study and meta-analysis. Gen Hosp Psychiatry. 2019;60:50-56.
31. Zhang C, Zhang H, Zhao M, Li Z, Cook CE, Buysse DJ, Zhao Y, Yao Y. Reliability, Validity, and Factor Structure of Pittsburgh Sleep Quality Index in Community-Based Centenarians. Front Psychiatry. 2020;11:573530.
32. Wu W, Jiang Y, Wang N, Zhu M, Liu X, Jiang F, Zhao G, Zhao Q. Sleep quality of Shanghai residents: population-based cross-sectional study. Qual Life Res. 2020;29:1055-1064.
33. Frank P, Batty GD, Pentti J, Jokela M, Poole L, Ervasti J, Vahtera J, Lewis G, Steptoe A, Kivimäki M. Association Between Depression and Physical Conditions Requiring Hospitalization. JAMA Psychiatry. 2023;80:690-699.
34. Gold SM, Köhler-Forsberg O, Moss-Morris R, Mehnert A, Miranda JJ, Bullinger M, Steptoe A, Whooley MA, Otte C. Comorbid depression in medical diseases. Nat Rev Dis Primers. 2020;6:69.
35. Hueniken K, Somé NH, Abdelhack M, Taylor G, Elton Marshall T, Wickens CM, Hamilton HA, Wells S, Felsky D. Machine Learning-Based Predictive Modeling of Anxiety and Depressive Symptoms During 8 Months of the COVID-19 Global Pandemic: Repeated Cross-sectional Survey Study. JMIR Ment Health. 2021;8:e32876.
36. Charalambous A, Dodlek N. Big Data, Machine Learning, and Artificial Intelligence to Advance Cancer Care: Opportunities and Challenges. Semin Oncol Nurs. 2023;39:151429.
37. Dong JX, Krzyzak A, Suen CY. Fast SVM training algorithm with decomposition on very large data sets. IEEE Trans Pattern Anal Mach Intell. 2005;27:603-618.
38. Zhang J, Mucs D, Norinder U, Svensson F. LightGBM: An Effective and Scalable Algorithm for Prediction of Chemical Toxicity-Application to the Tox21 and Mutagenicity Data Sets. J Chem Inf Model. 2019;59:4150-4158.
39. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143:29-36.
40. Gocoglu A, Demirel N, Bozdogan H. A Novel Information Complexity Approach to Score Receiver Operating Characteristic (ROC) Curve Modeling. Entropy (Basel). 2024;26:988.
41. Alba AC, Agoritsas T, Walsh M, Hanna S, Iorio A, Devereaux PJ, McGinn T, Guyatt G. Discrimination and Calibration of Clinical Prediction Models: Users' Guides to the Medical Literature. JAMA. 2017;318:1377-1384.
42. Iremashvili V, Manoharan M, Pelaez L, Rosenberg DL, Soloway MS. Clinically significant Gleason sum upgrade: external validation and head-to-head comparison of the existing nomograms. Cancer. 2012;118:378-385.
43. Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making. 2006;26:565-574.
44. Elor Y, Averbuch-Elor H. To SMOTE, or not to SMOTE? 2022 Preprint. Available from: arXiv:2201.08528.
45. Lundberg SM, Erion G, Chen H, DeGrave A, Prutkin JM, Nair B, Katz R, Himmelfarb J, Bansal N, Lee SI. From Local Explanations to Global Understanding with Explainable AI for Trees. Nat Mach Intell. 2020;2:56-67.
46. Zheng SS, Guo WQ, Lu H, Si QS, Liu BH, Wang HZ, Zhao Q, Jia WR, Yu TP. Machine learning approaches to predict the apparent rate constants for aqueous organic compounds by ferrate. J Environ Manage. 2023;329:116904.
47. Amit G, Girshovitz I, Marcus K, Zhang Y, Pathak J, Bar V, Akiva P. Estimation of postpartum depression risk from electronic health records using machine learning. BMC Pregnancy Childbirth. 2021;21:630.
48. Wang S, Xu R, Li G, Liu S, Zhu J, Gao P. A Plasma Proteomics-Based Model for Identifying the Risk of Postpartum Depression Using Machine Learning. J Proteome Res. 2025;24:824-833.
49. Yang C, Liu Z, Fang Y, Cao X, Xu G, Wang Z, Hu Z, Wang S, Wu X. Development and validation of a clinic machine-learning nomogram for the prediction of risk stratifications of prostate cancer based on functional subsets of peripheral lymphocyte. J Transl Med. 2023;21:465.
50. Hasannejadasl H, Osong B, Bermejo I, van der Poel H, Vanneste B, van Roermund J, Aben K, Zhang Z, Kiemeney L, Van Oort I, Verwey R, Hochstenbach L, Bloemen E, Dekker A, Fijten RRR. A comparison of machine learning models for predicting urinary incontinence in men with localized prostate cancer. Front Oncol. 2023;13:1168219.
51. Ge L, Liu J, Kang X, Wang W, Zhang D. Association of serum individual and mixed aldehydes with depressive symptoms in the general population: A machine learning study. J Affect Disord. 2024;345:8-17.
52. Olfati M, Samea F, Faghihroohi S, Balajoo SM, Küppers V, Genon S, Patil K, Eickhoff SB, Tahmasian M. Prediction of depressive symptoms severity based on sleep quality, anxiety, and gray matter volume: a generalizable machine learning approach across three datasets. EBioMedicine. 2024;108:105313.
53. Huang J, Xu T, Dai Y, Li Y, Tu R. Age-related differences in the number of chronic diseases in association with trajectories of depressive symptoms: a population-based cohort study. BMC Public Health. 2024;24:2496.
54. Li E, Ai F, Liang C. A machine learning model to predict the risk of depression in US adults with obstructive sleep apnea hypopnea syndrome: a cross-sectional study. Front Public Health. 2023;11:1348803.
55. Zhou Y, Han W, Yao X, Xue J, Li Z, Li Y. Developing a machine learning model for detecting depression, anxiety, and apathy in older adults with mild cognitive impairment using speech and facial expressions: A cross-sectional observational study. Int J Nurs Stud. 2023;146:104562.
56. Hao R, Namdar K, Liu L, Khalvati F. A Transfer Learning-Based Active Learning Framework for Brain Tumor Classification. Front Artif Intell. 2021;4:635766.
57. Kleppe A. Area under the curve may hide poor generalisation to external datasets. ESMO Open. 2022;7:100429.
58. Lorenzo-Luaces L, Rodriguez-Quintana N, Riley TN, Weisz JR. A placebo prognostic index (PI) as a moderator of outcomes in the treatment of adolescent depression: Could it inform risk-stratification in treatment with cognitive-behavioral therapy, fluoxetine, or their combination? Psychother Res. 2021;31:5-18.
59. Moffa G, Catone G, Kuipers J, Kuipers E, Freeman D, Marwaha S, Lennox BR, Broome MR, Bebbington P. Using Directed Acyclic Graphs in Epidemiological Research in Psychosis: An Analysis of the Role of Bullying in Psychosis. Schizophr Bull. 2017;43:1273-1279.
60. Hinata A, Kabasawa K, Watanabe Y, Kitamura K, Ito Y, Takachi R, Tsugane S, Tanaka J, Sasaki A, Narita I, Nakamura K. Education, household income, and depressive symptoms in middle-aged and older Japanese adults. BMC Public Health. 2021;21:2120.
61. Asiamah N, Cronin C, Abbott JE, Smith S. Interactions of depression, anxiety, and sleep quality with menopausal symptoms on job satisfaction among middle-aged health workers in England: a STROBE-based analysis. Hum Resour Health. 2024;22:64.
62. He K, Pang T, Huang H. The relationship between depressive symptoms and BMI: 2005-2018 NHANES data. J Affect Disord. 2022;313:151-157.
63. Mahayana D. Data-Driven LightGBM Controller for Robotic Manipulator. IEEE Access. 2024;12:40883-40893.
64. Zuo D, Yang L, Jin Y, Qi H, Liu Y, Ren L. Machine learning-based models for the prediction of breast cancer recurrence risk. BMC Med Inform Decis Mak. 2023;23:276.
65. Lin LH, Xu WQ, Wang SB, Hu Q, Zhang P, Huang JH, Ke YF, Ding KR, Hou CL, Jia FJ. U-shaped association between sleep duration and subjective cognitive complaints in Chinese elderly: a cross-sectional study. BMC Psychiatry. 2022;22:147.
66. Hu Q, Song Y, Wang S, Lin L, Ke Y, Zhang P. Association of subjective cognitive complaints with poor sleep quality: A cross-sectional study among Chinese elderly. Int J Geriatr Psychiatry. 2023;38:e5956.
67. Beisecker L, Harrison P, Josephson M, DeFreese JD. Depression, anxiety and stress among female student-athletes: a systematic review and meta-analysis. Br J Sports Med. 2024;58:278-285.
68. Benjamin CL, Curtis RM, Huggins RA, Sekiguchi Y, Jain RK, McFadden BA, Casa DJ. Sleep Dysfunction and Mood in Collegiate Soccer Athletes. Sports Health. 2020;12:234-240.
69. Jacobson NC, Newman MG. Anxiety and depression as bidirectional risk factors for one another: A meta-analysis of longitudinal studies. Psychol Bull. 2017;143:1155-1200.
70. Huang Y, Wang Y, Wang H, Liu Z, Yu X, Yan J, Yu Y, Kou C, Xu X, Lu J, Wang Z, He S, Xu Y, He Y, Li T, Guo W, Tian H, Xu G, Xu X, Ma Y, Wang L, Wang L, Yan Y, Wang B, Xiao S, Zhou L, Li L, Tan L, Zhang T, Ma C, Li Q, Ding H, Geng H, Jia F, Shi J, Wang S, Zhang N, Du X, Du X, Wu Y. Prevalence of mental disorders in China: a cross-sectional epidemiological study. Lancet Psychiatry. 2019;6:211-224.
71. Ji X, Ivers H, Beaulieu-Bonneau S, Morin CM. Complementary and alternative treatments for insomnia/insomnia-depression-anxiety symptom cluster: Meta-analysis of English and Chinese literature. Sleep Med Rev. 2021;58:101445.
72. Robson EM, Husin HM, Ghazaleh Dashti S, Vijayakumar N, Moreno-Betancur M, Moran P, Patton GC, Sawyer SM. Tracking the course of depressive and anxiety symptoms across adolescence (the CATS study): a population-based cohort study in Australia. Lancet Psychiatry. 2025;12:44-53.
73. Lee JH, Meyer EJ, Nenke MA, Lightman SL, Torpy DJ. Cortisol, Stress, and Disease-Bidirectional Associations; Role for Corticosteroid-Binding Globulin? J Clin Endocrinol Metab. 2024;109:2161-2172.
74. Perez-Caballero L, Torres-Sanchez S, Romero-López-Alberca C, González-Saiz F, Mico JA, Berrocoso E. Monoaminergic system and depression. Cell Tissue Res. 2019;377:107-113.
75. Li K, Wei W, Xu C, Lian X, Bao J, Yang S, Wang S, Zhang X, Zheng X, Wang Y, Zhong S. Prebiotic inulin alleviates anxiety and depression-like behavior in alcohol withdrawal mice by modulating the gut microbiota and 5-HT metabolism. Phytomedicine. 2024;135:156181.
76. Li D, Liang W, Zhang W, Huang Z, Liang H, Liu Q. Fecal microbiota transplantation repairs intestinal permeability and regulates the expression of 5-HT to influence alcohol-induced depression-like behaviors in C57BL/6J mice. Front Microbiol. 2023;14:1241309.
77. Herman JP, McKlveen JM, Ghosal S, Kopp B, Wulsin A, Makinson R, Scheimann J, Myers B. Regulation of the Hypothalamic-Pituitary-Adrenocortical Stress Response. Compr Physiol. 2016;6:603-621.
78. Dziurkowska E, Wesolowski M. Cortisol as a Biomarker of Mental Disorder Severity. J Clin Med. 2021;10:5204.
79. Lin YT, Liu TY, Yang CY, Yu YL, Chen TC, Day YJ, Chang CC, Huang GJ, Chen JC. Chronic activation of NPFFR2 stimulates the stress-related depressive behaviors through HPA axis modulation. Psychoneuroendocrinology. 2016;71:73-85.
80. Conrad CD. Chronic stress-induced hippocampal vulnerability: the glucocorticoid vulnerability hypothesis. Rev Neurosci. 2008;19:395-411.
81. Pagliusi-Jr M, Amorim-Marques AP, Lobo MK, Guimarães FS, Lisboa SF, Gomes FV. The rostral ventromedial medulla modulates pain and depression-related behaviors caused by social stress. Pain. 2024;165:1814-1823.
82. Barth C, Crestol A, de Lange AG, Galea LAM. Sex steroids and the female brain across the lifespan: insights into risk of depression and Alzheimer's disease. Lancet Diabetes Endocrinol. 2023;11:926-941.
83. Yuan K, Zheng YB, Wang YJ, Sun YK, Gong YM, Huang YT, Chen X, Liu XX, Zhong Y, Su SZ, Gao N, Lu YL, Wang Z, Liu WJ, Que JY, Yang YB, Zhang AY, Jing MN, Yuan CW, Zeng N, Vitiello MV, Patel V, Fazel S, Minas H, Thornicroft G, Fan TT, Lin X, Yan W, Shi L, Shi J, Kosten T, Bao YP, Lu L. A systematic review and meta-analysis on prevalence of and risk factors associated with depression, anxiety and insomnia in infectious diseases, including COVID-19: a call to action. Mol Psychiatry. 2022;27:3214-3222.
84. Paolucci EM, Loukov D, Bowdish DME, Heisz JJ. Exercise reduces depression and inflammation but intensity matters. Biol Psychol. 2018;133:79-84.
85. Zhang XY, Ye F, Yin ZH, Li YQ, Bao QN, Xia MZ, Chen ZH, Zhong WQ, Wu KX, Yao J, Liang FR. Research status and trends of physical activity on depression or anxiety: a bibliometric analysis. Front Neurosci. 2024;18:1337739.
86. Langan-Evans C, Hearris MA, Gallagher C, Long S, Thomas C, Moss AD, Cheung W, Howatson G, Morton JP. Nutritional Modulation of Sleep Latency, Duration, and Efficiency: A Randomized, Repeated-Measures, Double-Blind Deception Study. Med Sci Sports Exerc. 2023;55:289-300.