Introduction. In educational contexts, adolescent psychological assessment is essential for supporting students' mental health and development, yet conventional self-report and standardized measures remain time-consuming and labor-intensive. Recent work has explored large language models (LLMs) for generating psychological assessment reports from personal and behavioral data. However, the "black-box" nature of LLMs undermines interpretability and reliability in rigorous psychological settings.
Purpose. To address this critical issue, this study proposes a Neural-Symbolic dual-path fusion framework for generating efficient and explainable psychological assessments. This design preserves the generative flexibility of LLMs while ensuring theoretical coherence and methodological rigor.
Method. The framework integrates two complementary reasoning pathways mirroring human dual cognition: a neural inference path using LLMs for intuitive association, and a symbolic reasoning path based on knowledge-graph logic. The neural path captures each student's unique behavioral patterns (e.g., biting a pen when feeling anxious), while the symbolic path follows psychological variable relations (e.g., ). A consistency module cross-validates both outputs, ensuring mutual reinforcement and calibrated reliability.
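As an illustration only, the dual-path design described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the framework's implementation: the cue table stands in for an LLM call, the knowledge-graph edges and their weights are hypothetical placeholders (the source does not list its variable relations), and the averaging and discount rules in `cross_validate` are illustrative choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    construct: str     # psychological variable, e.g. "anxiety"
    confidence: float  # the path's confidence in [0, 1]
    evidence: str      # human-readable justification


# Hypothetical cue table standing in for an LLM call (assumption).
NEURAL_CUES = {"biting a pen": ("anxiety", 0.7)}

# Hypothetical knowledge-graph edges: (indicator, construct, weight).
KG_EDGES = [
    ("exam_avoidance", "anxiety", 0.8),
    ("falling_grades", "low_motivation", 0.6),
]


def neural_path(notes: str) -> dict[str, Finding]:
    """Associate free-text behavioral cues with constructs."""
    found = {}
    for cue, (construct, conf) in NEURAL_CUES.items():
        if cue in notes.lower():
            found[construct] = Finding(construct, conf, f"note mentions '{cue}'")
    return found


def symbolic_path(indicators: set[str]) -> dict[str, Finding]:
    """Follow knowledge-graph edges from observed indicators to constructs."""
    found = {}
    for indicator, construct, weight in KG_EDGES:
        if indicator in indicators:
            found[construct] = Finding(construct, weight, f"{indicator} -> {construct}")
    return found


def cross_validate(neural, symbolic, discount=0.5):
    """Label each construct by whether both paths support it."""
    report = {}
    for construct in sorted(neural.keys() | symbolic.keys()):
        n, s = neural.get(construct), symbolic.get(construct)
        if n is not None and s is not None:
            # Both paths agree: average their confidences.
            score = (n.confidence + s.confidence) / 2
            report[construct] = ("corroborated", round(score, 2))
        else:
            # Only one path supports the construct: discount its confidence.
            only = n if n is not None else s
            report[construct] = ("single-path", round(only.confidence * discount, 2))
    return report


report = cross_validate(
    neural_path("Often seen biting a pen before exams."),
    symbolic_path({"exam_avoidance", "falling_grades"}),
)
print(report)
```

In this toy run, "anxiety" is supported by both paths and is marked corroborated, while "low_motivation" comes from the symbolic path alone and is down-weighted. Keeping the evidence string on each finding is what would let a generated report cite why a conclusion was reached.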
Results. To assess the effectiveness of the proposed framework, a dataset of 300 synthetic student profiles was constructed, each containing brief emotional and academic records. Psychological assessment reports were generated using three approaches: (1) a neural-only model based on LLM inference, (2) a symbolic-only model using knowledge-graph reasoning, and (3) the proposed Neural-Symbolic dual-path framework. The resulting reports were evaluated along six dimensions (readability, depth of insight, logical coherence, informational accuracy, credibility, and interpretability) by both expert psychologists and independent LLM evaluators. Results show that the Neural-Symbolic framework significantly outperforms the other two approaches (F(2, 297)=89.41, p<0.001), particularly in depth of insight, logical coherence, and interpretability.
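For readers unfamiliar with the reported statistic, the following sketch shows how an F value and its degrees of freedom arise in a one-way ANOVA. It assumes a between-groups design, which the abstract does not state explicitly, and the ratings are made-up toy numbers, not study data.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way between-groups ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy ratings for three hypothetical conditions (not data from the study).
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = one_way_anova(toy_groups)
print(f_stat, df_between, df_within)
```

Under the same formulas, three conditions and 300 observations in total give degrees of freedom (3 - 1, 300 - 3) = (2, 297), which matches the form of the statistic reported above.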
Conclusions. These findings not only demonstrate the framework's potential to generate efficient and trustworthy assessments, but also establish a robust, practical paradigm for building explainable AI applications in psychological evaluation.