
CCD-CBT: Multi-Agent Therapeutic Interaction for CBT Guided by Cognitive Conceptualization Diagram

Published 8 Apr 2026 in cs.CL | (2604.06551v1)

Abstract: LLMs show potential for scalable mental-health support by simulating Cognitive Behavioral Therapy (CBT) counselors. However, existing methods often rely on static cognitive profiles and omniscient single-agent simulation, failing to capture the dynamic, information-asymmetric nature of real therapy. We introduce CCD-CBT, a multi-agent framework that shifts CBT simulation along two axes: 1) from a static to a dynamically reconstructed Cognitive Conceptualization Diagram (CCD), updated by a dedicated Control Agent, and 2) from omniscient to information-asymmetric interaction, where the Therapist Agent must reason from inferred client states. We release CCDCHAT, a synthetic multi-turn CBT dataset generated under this framework. Evaluations with clinical scales and expert therapists show that models fine-tuned on CCDCHAT outperform strong baselines in both counseling fidelity and positive-affect enhancement, with ablations confirming the necessity of dynamic CCD guidance and asymmetric agent design. Our work offers a new paradigm for building theory-grounded, clinically-plausible conversational agents.

Summary

  • The paper introduces a multi-agent framework that dynamically reconstructs cognitive conceptualization diagrams to simulate clinically authentic CBT sessions.
  • It leverages the CCDChat dataset of 4,500 curated sessions and demonstrates superior performance in CTRS and PANAS evaluations.
  • The control agent coordinates phased therapy, enforcing information asymmetry to mimic real-world therapeutic reasoning.

Multi-Agent Cognitive Behavioral Therapy Simulation via Dynamic Cognitive Conceptualization Diagrams

Motivation and Paradigm Shift

The CCD-CBT framework addresses two limitations common in prior LLM-based CBT simulation: reliance on static cognitive profiles and omniscient single-agent simulation. Most previous models fix a client's cognitive-affective schema for the entire session, reducing the dialogue to the execution of a static script. Omniscient designs also give the model access to both the client's and the therapist's internal states, producing unrealistically successful and clinically implausible interactions. CCD-CBT introduces a multi-agent architecture that separates the roles into Client, Therapist, and Control Agents. This enables dynamic reconstruction of the client’s Cognitive Conceptualization Diagram (CCD) and enforces information asymmetry: the Therapist Agent must operate solely from inferred states, emulating real-world therapeutic reasoning.

Figure 1: Comparison of CBT simulation paradigms, contrasting early prompt-based methods, static cognitive modeling, and the dynamic multi-agent framework of CCD-CBT.

Framework Architecture and Agent Separation

CCD-CBT models psychotherapy as a structured, multi-phase process, in line with clinical CBT best practices. The framework contains three distinct LLM-driven agents:

  • Client Agent: Simulates the therapy-seeking individual using a ground-truth CCD, modulating for attitudinal stance (positive, neutral, negative), ensuring engagement realism and stance-dependent variation.
  • Therapist Agent: Generates responses based on a dynamically reconstructed CCD as provided by the Control Agent, following phase-specific strategies and maintaining information asymmetry from the client’s internal states.
  • Control Agent: Orchestrates session progression through Identification, Assessment, Intervention, and Summary phases. It actively updates the therapist-accessible CCD and phase state tracker, autonomously managing strategic planning and phase transitions.

Figure 2: Overview of CCD-CBT framework showing Client, Therapist, and Control Agents interacting according to the CCD and four therapeutic phases.

This structure yields adaptive, multi-turn dialogues where the therapist must infer and reason about the client’s psychological status rather than relying on omniscient access.
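
The agent separation described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the class names, the three-slot `CCD`, the one-disclosure-per-turn client, and the fixed `turns_per_phase` rule are all assumptions made for illustration, and the real agents are LLM-driven. What the sketch does show is the asymmetry: only the Control Agent's reconstructed CCD ever reaches the Therapist Agent.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    IDENTIFICATION = 1
    ASSESSMENT = 2
    INTERVENTION = 3
    SUMMARY = 4


@dataclass
class CCD:
    # Simplified, illustrative subset of CCD components (assumption).
    automatic_thoughts: list = field(default_factory=list)
    emotions: list = field(default_factory=list)
    core_beliefs: list = field(default_factory=list)


class ClientAgent:
    """Holds the ground-truth CCD and discloses one item per turn (stub)."""

    def __init__(self, ground_truth: CCD, stance: str = "neutral"):
        self.stance = stance  # recorded only; this stub does not vary by stance
        self._queue = (
            [("automatic_thoughts", t) for t in ground_truth.automatic_thoughts]
            + [("emotions", e) for e in ground_truth.emotions]
            + [("core_beliefs", b) for b in ground_truth.core_beliefs]
        )

    def reply(self, therapist_utterance: str):
        # Returns (slot, text), or None once nothing is left to disclose.
        return self._queue.pop(0) if self._queue else None


class ControlAgent:
    """Rebuilds the therapist-visible CCD and tracks the current phase."""

    def __init__(self, turns_per_phase: int = 1):
        self.inferred = CCD()
        self.phase = Phase.IDENTIFICATION
        self.turns_per_phase = turns_per_phase  # hypothetical transition rule
        self._turns_in_phase = 0

    def update(self, disclosure):
        if disclosure is not None:
            slot, text = disclosure
            getattr(self.inferred, slot).append(text)
        self._turns_in_phase += 1
        if (self._turns_in_phase >= self.turns_per_phase
                and self.phase is not Phase.SUMMARY):
            self.phase = Phase(self.phase.value + 1)
            self._turns_in_phase = 0


class TherapistAgent:
    """Sees only the inferred CCD and the phase, never the ground truth."""

    def respond(self, inferred: CCD, phase: Phase) -> str:
        known = (len(inferred.automatic_thoughts) + len(inferred.emotions)
                 + len(inferred.core_beliefs))
        return f"[{phase.name}] responding with {known} inferred CCD item(s)"


def run_session(client, therapist, control, max_turns=8):
    transcript = []
    for _ in range(max_turns):
        utterance = therapist.respond(control.inferred, control.phase)
        transcript.append(utterance)
        if control.phase is Phase.SUMMARY:
            break
        control.update(client.reply(utterance))
    return transcript
```

In this sketch `run_session` never passes the client's ground-truth CCD to the therapist; swapping the stub methods for model calls would keep that boundary intact.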

Dataset Construction: CCDChat

CCD-CBT enables the creation of CCDChat, a synthetic multi-turn CBT dataset with 4,500 curated sessions. Each session is grounded in a distinct CCD, derived from clinically validated C2D2 resources, covering broad situational contexts (e.g., family, work, health). The dataset enforces phased therapeutic completeness and integrates realistic attitudinal variations.

Figure 3: Distribution of situational contexts within CCD-CBT and CCDChat, demonstrating coverage across seven primary life domains.

This resource is characterized by explicit CBT grounding, end-to-end CCD guidance, and consistently structured, extended dialogues, marking a departure from template-based and rule-driven prior datasets.

Evaluation and Numerical Results

CCD-CBT models were fine-tuned using LoRA on the CCDChat corpus, with separate adapters for each agent and phase. Evaluation combines automatic scoring, in which GPT-4o-mini rates dialogues on the Cognitive Therapy Rating Scale (CTRS) and the Positive and Negative Affect Schedule (PANAS), with expert-based assessment.

CTRS (Clinical Competence):

CCD-CBT achieves the highest scores on five of six dimensions, notably outperforming baselines in CBT-specific Strategy and Guided Discovery by more than 0.2 points. The model consistently excels in procedural competence metrics, which supports the case for dynamic CCD guidance in structured intervention.

PANAS (Emotional Outcomes):

CCD-CBT induces the most pronounced increase in positive affect and largest decrease in negative affect across all client attitudes, especially under challenging neutral and negative stances.
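
To make the affect comparison concrete, the following Python sketch computes pre-to-post changes on the two PANAS subscales. The standard PANAS has 10 positive-affect and 10 negative-affect items, each rated 1 to 5, so each subscale score ranges from 10 to 50. The function names and the pre/post framing are assumptions for illustration; the paper's exact scoring protocol with GPT-4o-mini is not described here.

```python
def subscale_score(ratings):
    """Sum ten PANAS item ratings (each 1-5) into a subscale score (10-50)."""
    if len(ratings) != 10 or any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("expected ten ratings, each between 1 and 5")
    return sum(ratings)


def affect_change(pre_pa, post_pa, pre_na, post_na):
    """Return post-minus-pre change for positive (PA) and negative (NA) affect.

    A helpful session should give delta_pa > 0 and delta_na < 0.
    """
    return {
        "delta_pa": subscale_score(post_pa) - subscale_score(pre_pa),
        "delta_na": subscale_score(post_na) - subscale_score(pre_na),
    }
```

For example, a simulated client whose positive-affect items all move from 2 to 3 and whose negative-affect items all move from 4 to 2 would yield `delta_pa = 10` and `delta_na = -20`.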

Ablation Studies:

Eliminating CCD guidance or fine-tuning results in significant drops in CTRS metrics, particularly Strategy and Focus, underscoring the additive contributions of CCDChat and dynamic CCD planning.

Figure 4: Human evaluation results showing CCDChat dialogue preference over CACTUS and PsyDTCorpus baselines across Helpfulness, Coherence, Empathy, and Guidance.

Human Evaluation:

Licensed CBT practitioners rate CCDChat dialogues consistently superior to both CACTUS and PsyDTCorpus across all clinical dimensions (p < 0.05), confirming empirical authenticity and therapeutic effectiveness.

Component Analysis and Dataset Quality

The Control Agent’s dynamic CCD reconstruction achieves high fidelity, with a mean alignment score of 2.77 out of 3 and the strongest alignment on core beliefs and coping strategies. Analyses by client attitude and belief content show minimal performance variance, indicating that the model adapts robustly to diverse patient schemas.

CCDChat guarantees clinical phase completeness (Identification, Assessment, Intervention, Summary). Baseline datasets omit critical stages (e.g., “Assessment” and “Homework Assignment”), leading to degraded professional fidelity. Quantitative CTRS scoring across datasets corroborates CCDChat’s superiority in therapeutic skill dimensions.
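
A phase-completeness check of the kind described above could look like the following Python sketch. It assumes each dialogue turn carries a phase label and that phases occur once each in a fixed order without revisiting; both are assumptions for illustration, and the function names are hypothetical rather than taken from the paper.

```python
REQUIRED_PHASES = ("Identification", "Assessment", "Intervention", "Summary")


def missing_phases(turn_phases):
    """List required phases that never appear in a session's turn labels."""
    seen = set(turn_phases)
    return [p for p in REQUIRED_PHASES if p not in seen]


def is_phase_complete(turn_phases):
    """True if the session visits all four phases, in order, exactly once each."""
    # Collapse runs of identical labels so multi-turn phases count once.
    collapsed = [
        p for i, p in enumerate(turn_phases)
        if i == 0 or p != turn_phases[i - 1]
    ]
    return tuple(collapsed) == REQUIRED_PHASES
```

A session labeled Identification, Intervention, Summary would fail the check, and `missing_phases` would report Assessment as the omitted stage.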

Figure 5: Components and relationships within the Cognitive Conceptualization Diagram, central to therapy session structuring in CCD-CBT.

Case Study and Process Walkthrough

A full-case walkthrough demonstrates the framework’s phased approach. In the Identification phase, the initial situation and automatic thoughts surface the client's emotions and behaviors, and the therapist uses Socratic questioning and the downward arrow technique to probe core beliefs and relevant history. A structured Assessment then quantifies belief and emotion intensities. The Intervention phase guides the client through generating alternative thoughts and designing a behavioral experiment. Finally, the Summary phase consolidates practice and affirms progress.

Practical and Theoretical Implications

CCD-CBT exemplifies an advance in simulating clinically authentic psychotherapy sessions by enforcing information asymmetry and dynamic cognitive profiling. Practically, it provides a scalable mechanism for training and evaluating digital counselors and can augment mental-health support for populations with limited therapist access. Theoretically, the framework demonstrates the necessity of modeling agent-specific knowledge and interactional inference for aligning conversational AI with real-world therapeutic constraints. The CCDChat dataset establishes a high-fidelity benchmark for future research in theory-grounded clinical dialogue and agent collaboration.

Future Directions and Limitations

Future extensions may encompass multi-session modeling, incorporating non-verbal and multimodal signals, and broadening CCD diversity for cross-cultural generalizability. The reliance on synthetic, Chinese-language data introduces demographic bias, and the single-session paradigm omits longitudinal therapeutic dynamics. Addressing these limitations is crucial for fair deployment.

Conclusion

CCD-CBT delivers a multi-agent, dynamically guided simulation of CBT that aligns with clinical realities better than previous static and omniscient paradigms. Models trained on CCDChat demonstrate superior therapeutic competence and affective regulation, validated through both automatic and expert human evaluation. The research provides both a novel computational framework and a clinically grounded dataset, advancing the realism and efficacy of digital mental-health agents (2604.06551).
