Observational Study
Copyright ©The Author(s) 2025. Published by Baishideng Publishing Group Inc. All rights reserved.
World J Psychiatry. Aug 19, 2025; 15(8): 106622
Published online Aug 19, 2025. doi: 10.5498/wjp.v15.i8.106622
Machine learning-based nomogram for predicting depressive symptoms in women: A cross-sectional study in Guangdong Province, China
Jia-Min Chen, Mei Rao, Yu-Ting Wei, Qiong-Gui Zhou, Jun-Long Tao, Shi-Bin Wang, Bo Bi
Jia-Min Chen, Yu-Ting Wei, Qiong-Gui Zhou, Jun-Long Tao, Bo Bi, School of Public Health, Hainan Medical University, Hainan Academy of Medical Science, Haikou 571199, Hainan Province, China
Mei Rao, Department of Pharmacy, Longyan First Hospital Affiliated to Fujian Medical University, Longyan 364000, Fujian Province, China
Shi-Bin Wang, Guangdong Mental Health Center, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou 510000, Guangdong Province, China
Co-first authors: Jia-Min Chen and Mei Rao.
Co-corresponding authors: Shi-Bin Wang and Bo Bi.
Author contributions: Chen JM and Rao M collected the data, wrote the first version of the manuscript, and made equal contributions as co-first authors; Wei YT, Zhou QG, and Tao JL contributed to data processing and analysis; Wang SB and Bi B designed the study, supervised the study, revised the paper, and made equal contributions as co-corresponding authors. All authors agreed to publish the manuscript.
Supported by Longyan City Science and Technology Plan Project, No. 2024 LYF17067.
Institutional review board statement: The study protocol was approved by the Research Ethics Committee of the Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, No. GDREC2018543H(R1).
Informed consent statement: All participants provided written informed consent prior to the survey.
Conflict-of-interest statement: All the authors report no relevant conflicts of interest for this article.
STROBE statement: The authors have read the STROBE Statement-checklist of items, and the manuscript was prepared and revised according to the STROBE Statement-checklist of items.
Data sharing statement: Data referenced in this study are available in the Guangdong Provincial Sleep and Psychosomatic Health Survey database. The datasets analyzed during the current study are available from the corresponding author on reasonable request. The complete data are not publicly available because they contain information that could compromise research participant privacy or consent. The code related to this study has been uploaded to GitHub: https://github.com/KMccn/Machine-Learning-Based-Nomogram-for-Predicting-Depressive-Symptoms.
Open Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/Licenses/by-nc/4.0/
Corresponding author: Bo Bi, School of Public Health, Hainan Medical University, Xueyuan Road, Longhua District, Haikou 571199, Hainan Province, China. bibo@hainmc.edu.cn
Received: March 4, 2025
Revised: April 17, 2025
Accepted: June 25, 2025
Published online: August 19, 2025
Processing time: 158 Days and 5.9 Hours
Abstract
BACKGROUND

Female depression is a prevalent and increasingly recognized mental health issue. Due to cultural and social factors, many female patients still face challenges in diagnosis and treatment, and traditional assessment methods often fail to identify high-risk individuals accurately. This highlights the necessity of developing more precise predictive tools. Utilizing machine learning (ML) algorithms to construct predictive models may overcome the limitations of traditional methods, providing more comprehensive support for women’s mental health.

AIM

To construct an ML-nomogram hybrid model that translates multivariate risk predictors of female depressive symptoms into actionable clinical scoring thresholds, optimizing predictive accuracy and interpretability for healthcare applications.

METHODS

We analyzed data from 7609 female participants aged 18 to 85 years from the Guangdong Provincial Sleep and Psychosomatic Health Survey. Sixteen variables, including anxiety symptoms, insomnia, chronic diseases, exercise habits, and age, were selected based on prior literature, and all were entered into the ML models to make full use of the available predictive information. Three ML algorithms (extreme gradient boosting, support vector machine, and light gradient boosting machine) were employed to construct predictive models. Model performance was evaluated using accuracy, precision, recall, F1 score, and area under the curve (AUC). Feature importance was interpreted using SHapley Additive exPlanations (SHAP), and ablation studies validated the impact of the top five SHAP-derived features on predictive performance. A nomogram was then constructed based on these prioritized predictors. Clinical utility was assessed through decision curve analysis.
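
As an illustration of the evaluation metrics named above, the following Python sketch computes accuracy, precision, recall, F1 score, and AUC from true labels and predicted probabilities. It is not the authors' code (their implementation is in the GitHub repository cited in the data sharing statement). The function names, the 0.5 classification threshold, and the toy data are assumptions made for this example only.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 at a fixed probability threshold.

    The 0.5 default is an assumption for illustration, not the study's cutoff.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = depressive symptoms present, 0 = absent.
toy_labels = [0, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]

print(classification_metrics(toy_labels, toy_scores))
print(roc_auc(toy_labels, toy_scores))
```

Because AUC is computed from the ranking of scores rather than from a single cutoff, it can be compared across models without fixing a classification threshold, whereas the other four metrics depend on the threshold chosen.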

RESULTS

The prevalence of depressive symptoms in the sample was 6.8%. Among the predictive models, light gradient boosting machine achieved the highest AUC (0.867), ahead of extreme gradient boosting (AUC = 0.862) and support vector machine (AUC = 0.849). SHAP analysis identified insomnia, anxiety symptoms, age, chronic disease, and exercise as the top five predictors. The nomogram based on these features demonstrated excellent discrimination (AUC = 0.910) and calibration, with significant net benefits in decision curve analysis compared to baseline strategies. The model effectively stratifies the risk of depressive symptoms, facilitating personalized and quantitative assessments in clinical settings. We also developed an interactive digital version of the nomogram to facilitate its application in clinical practice.
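
For readers unfamiliar with decision curve analysis, the sketch below shows how net benefit is conventionally computed at one threshold probability and compared with a "treat all" strategy ("treat none" has a net benefit of zero by definition). It assumes the baseline strategies mentioned above are these two conventional references. The function names and toy data are hypothetical and do not reproduce the study's results.

```python
from typing import Sequence


def _check_threshold(threshold: float) -> None:
    # Net benefit is undefined at 0 and 1 because of the odds term pt / (1 - pt).
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")


def net_benefit(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float
) -> float:
    """Net benefit of acting on everyone whose predicted risk >= threshold.

    NB = TP/n - (FP/n) * pt / (1 - pt), where pt is the threshold probability.
    """
    _check_threshold(threshold)
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Net benefit of the reference strategy that treats every participant."""
    _check_threshold(threshold)
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = depressive symptoms present, 0 = absent.
toy_labels = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
toy_probs = [0.90, 0.20, 0.60, 0.40, 0.10, 0.05, 0.30, 0.70, 0.15, 0.10]

for pt in (0.2, 0.5):
    print(pt, net_benefit(toy_labels, toy_probs, pt),
          net_benefit_treat_all(toy_labels, pt))
```

A model is considered clinically useful over the range of threshold probabilities where its net benefit exceeds both reference strategies. Plotting these values across thresholds gives the decision curve.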

CONCLUSION

The ML-based model effectively predicts depressive symptoms in women, identifying insomnia, anxiety symptoms, age, chronic diseases, and exercise as key predictors, offering a practical tool for early detection and intervention.

Keywords: Depressive symptoms; Women’s mental health; Machine learning; Predictive modeling; SHapley Additive exPlanations; Nomogram; Guangdong Province

Core Tip: This study leverages machine learning to develop a highly accurate predictive model for depressive symptoms in women. The light gradient boosting machine model achieved the best performance among the three algorithms compared, supported by SHapley Additive exPlanations for interpretability and a nomogram for clinical application. Ablation studies systematically validated the critical contributions of the top-ranked SHapley Additive exPlanations features, demonstrating significant performance degradation upon their removal. This approach offers a practical tool for early detection and personalized intervention, addressing the limitations of traditional methods. The findings highlight the potential of machine learning to enhance women’s mental health outcomes, with implications for improving diagnostic precision and treatment strategies in diverse clinical settings.