Introduction
Adolescent depression is a critical public health challenge, and it is often underdiagnosed because of limited resources and stigma. Interpretable machine learning (ML) models offer a promising approach to scalable screening and early risk detection.
Purpose
We developed and validated ML models for screening current depression and for prospectively predicting depression risk among adolescents, using large multi-site survey datasets from China. This work addresses the need for efficient, accurate identification of at-risk youth.
Method
Data were collected at baseline (T1, N=7406) and at a follow-up 6 months later (T2, N=2730); an independent external cohort (N=6015) was used for validation. We applied LASSO regression for feature selection, which identified 12 key questionnaire items. Four algorithms (random forest, XGBoost, support vector machine, and logistic regression) were trained to classify current depression and to predict future episodes. A 12-item random forest model was selected for its combination of performance and interpretability. SHAP (SHapley Additive exPlanations) values were used to interpret the model's predictions by quantifying how much each feature contributed to the predicted depression risk.
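To make the interpretation step concrete, the following is a minimal sketch of how Shapley attributions are defined for a single respondent. It is not the study's code: the `toy_risk` function, the three item names, and the all-zero baseline are hypothetical, and the brute-force enumeration stands in for the faster tree-specific algorithms normally used with random forests.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one instance.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


# Hypothetical toy risk score over three questionnaire items (assumption).
def toy_risk(items):
    sleep, mood, stress = items
    return 0.1 * sleep + 0.3 * mood + 0.2 * mood * stress


x = [2, 3, 1]         # one respondent's item scores (made up)
baseline = [0, 0, 0]  # reference respondent (made up)
phi = shapley_values(toy_risk, x, baseline)
```

The attributions sum to the difference between the respondent's score and the baseline score, which is the property that lets per-item contributions be read as shares of the predicted risk. The interaction term between `mood` and `stress` is split equally between those two items.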
Results
The 12-item random forest screening model achieved excellent discrimination of current depression (AUC ~0.94) with high sensitivity and specificity. In prospective validation, it predicted depression at follow-up with acceptable discrimination (AUC ~0.78). In the external validation cohort, screening performance remained robust, whereas longer-term prediction of later depression onset was weaker (AUC ~0.68). Screening performance therefore generalized well across datasets, and the SHAP-based explanations enhanced transparency by identifying consistent risk factors.
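For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen depressed respondent receives a higher model score than a randomly chosen non-depressed one. The sketch below computes it directly from that definition; the labels and scores are made up for illustration and are not study data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = depressed, 0 = not depressed
labels = [1, 1, 0, 1, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
```

Under this reading, a value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why the gap between the screening and longer-term prediction results matters for practical use.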
Conclusion
Our findings indicate that interpretable ML-based screening and prediction tools can accurately identify adolescents at risk for depression. These scalable, transparent models could facilitate early intervention and inform global mental health strategies aimed at improving adolescent well-being.