Published online Aug 19, 2025. doi: 10.5498/wjp.v15.i8.106622
Revised: April 17, 2025
Accepted: June 25, 2025
Published online: August 19, 2025
Processing time: 158 Days and 5.9 Hours
Female depression is a prevalent and increasingly recognized mental health issue. Due to cultural and social factors, many female patients still face challenges in diagnosis and treatment, and traditional assessment methods often fail to identify high-risk individuals accurately. This highlights the necessity of developing more precise predictive tools. Utilizing machine learning (ML) algorithms to construct predictive models may overcome the limitations of traditional methods, provi
To construct an ML-nomogram hybrid model that translates multivariate risk pre
We analyzed data from 7609 female participants aged 18 to 85 years from the Guangdong Provincial Sleep and Psychosomatic Health Survey. Sixteen variables, including anxiety symptoms, insomnia, chronic diseases, exercise habits, and age, were selected based on prior literature, and all were entered into the ML models to make full use of the available predictive information. Three ML algorithms, extreme gradient boosting, support vector machine, and light gradient boosting machine, were employed to construct predictive models. Model performance was evaluated using accuracy, precision, recall, F1 score, and area under the curve (AUC). Feature importance was interpreted using SHapley Additive exPlanations (SHAP), and ablation studies were used to validate the impact of the top five SHAP-derived features on predictive performance. A nomogram was then constructed from these prioritized predictors. Clinical utility was assessed through decision curve analysis.
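As a minimal illustration of the evaluation metrics named above, the following Python sketch computes accuracy, precision, recall, F1 score, and AUC from binary labels and predicted probabilities. It is not the study's code: the function names, the 0.5 classification threshold, and the eight-case toy data are assumptions made purely for illustration.

```python
from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> dict[str, float]:
    """Threshold the scores, then compute accuracy, precision, recall, and F1."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the survey): 1 = depressive symptoms present.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1]
toy_scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.90]

print(classification_metrics(toy_labels, toy_scores))
print("AUC:", roc_auc(toy_labels, toy_scores))
```

The pairwise AUC above is quadratic in the number of cases and is meant only to make the definition concrete. Unlike the other four metrics, AUC does not depend on the chosen classification threshold.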
The prevalence of depressive symptoms in the sample was 6.8%. Among the predictive models, light gradient boosting machine achieved the highest AUC (0.867), followed by extreme gradient boosting (AUC = 0.862) and support vector machine (AUC = 0.849). SHAP analysis identified insomnia, anxiety symptoms, age, chronic disease, and exercise as the top five predictors. The nomogram based on these features demonstrated excellent discrimination (AUC = 0.910) and calibration, with significant net benefits in decision curve analysis compared to baseline strategies. The model effectively stratifies the risk of depressive symptoms, facilitating personalized and quantitative assessments in clinical settings. We also developed an interactive digital version of the nomogram to facilitate its application in clinical practice.
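Decision curve analysis compares the net benefit of acting on a model's predictions with the baseline strategies of treating everyone and treating no one, across a range of threshold probabilities. The Python sketch below uses the standard net-benefit definition, TP/n − (FP/n) × p_t/(1 − p_t), where p_t is the threshold probability. The function names and toy data are hypothetical and are not taken from the study.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_score: Sequence[float], threshold: float) -> float:
    """Net benefit of intervening on everyone whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold probability must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for t, s in zip(y_true, y_score) if s >= threshold and t == 1)
    fp = sum(1 for t, s in zip(y_true, y_score) if s >= threshold and t == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Baseline strategy: intervene on every participant regardless of score."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold probability must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# The "treat none" baseline has a net benefit of 0 at every threshold.
NET_BENEFIT_TREAT_NONE = 0.0

# Hypothetical toy data (NOT from the survey): 1 = depressive symptoms present.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1]
toy_scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.90]

for pt in (0.1, 0.2, 0.3, 0.5):
    model_nb = net_benefit(toy_labels, toy_scores, pt)
    all_nb = net_benefit_treat_all(toy_labels, pt)
    print(f"p_t={pt:.1f}  model={model_nb:.3f}  treat-all={all_nb:.3f}  treat-none=0.000")
```

A model is clinically useful at a given threshold when its net benefit exceeds both baselines. Plotting these values against the threshold probability yields the decision curve.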
The ML-based model effectively predicts depressive symptoms in women, identifying insomnia, anxiety symp
Core Tip: This study leverages machine learning to develop a highly accurate predictive model for depressive symptoms in women. The light gradient boosting machine model performed best among the three algorithms compared, supported by SHapley Additive exPlanations for interpretability and a nomogram for clinical application. Ablation studies systematically validated the critical contributions of the top-ranked SHapley Additive exPlanations features, demonstrating significant performance degradation upon their removal. This approach offers a practical tool for early detection and personalized intervention, addressing the limitations of traditional methods. The findings highlight the potential of machine learning to enhance women’s mental health outcomes, with implications for improving diagnostic precision and treatment strategies in diverse clinical settings.