Exploring Expert Perspectives on Wearable-Triggered LLM Conversational Support for Daily Stress Management

Published 6 Apr 2026 in cs.HC | (2604.04915v1)

Abstract: Wearable devices increasingly support stress detection, while LLMs enable conversational mental health support. However, designing systems that meaningfully connect wearable-triggered stress events with generative dialogue remains underexplored, particularly from a design perspective. We present EmBot, a functional mobile application that combines wearable-triggered stress detection with LLM-based conversational support for daily stress management. We used EmBot as a design probe in semi-structured interviews with 15 mental health experts to examine their perspectives and surface early design tensions and considerations that arise from wearable-triggered conversational support, informing the future design of such systems for daily stress management and mental health support.


Authors (6)

Summary

The paper identifies key design tensions and expert recommendations for integrating real-time wearable stress detection with adaptive LLM dialogue.
The study uses the EmBot design probe with simulated stress events to gather structured feedback on system transparency, user agency, and conversational personalization.
The paper underscores the need to balance timely, context-aware interventions with risks of notification fatigue, emphasizing robust privacy and safety protocols.

Expert Perspectives on Wearable-Triggered LLM Conversational Support for Daily Stress Management

Introduction

Integrating passive physiological sensing with generative AI-based conversational interfaces promises to advance digital mental health support, especially for daily stress management. The work "Exploring Expert Perspectives on Wearable-Triggered LLM Conversational Support for Daily Stress Management" (2604.04915) contributes a qualitative investigation into how mental health experts perceive systems that couple wearable-triggered stress detection with LLM-driven conversational support. Using EmBot, a functional mobile application, as a design probe, the study elicits expert insights into both the opportunities and the principal design tensions at the intersection of mobile sensing, AI-based dialogue, and real-world clinical practice.

System Overview: EmBot Design and Methodology

EmBot is architected to connect continuous stress detection via wearable devices with adaptive LLM-powered conversational interventions. The application pipeline consists of four core stages: real-time detection, user feedback, LLM-mediated support, and reflection.

Stress events are detected and operationalized as notification triggers, prompting user responses to corroborate, contextualize, or reject the event. Upon confirmation or contextual input, an LLM-initiated chat provides coping prompts and reflective dialogue, grounding the conversation in the event and the user’s response. Longitudinal engagement is facilitated through a history feature, consolidating past detections and interactions.
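The flow described above can be illustrated with a minimal Python sketch. This is not EmBot's implementation, which the source does not describe at code level. The names `StressEvent`, `UserResponse`, `History`, and `handle_event` are hypothetical, and the choice to log rejected events in the history is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class UserResponse(Enum):
    """The three reactions to a detection notification named in the text."""
    CONFIRM = "confirm"          # user corroborates the event
    ADD_CONTEXT = "add_context"  # user contextualizes the event
    REJECT = "reject"            # user rejects the event


@dataclass
class StressEvent:
    event_id: str
    signal: str                              # illustrative, e.g. "heart rate elevation"
    response: Optional[UserResponse] = None
    context_note: str = ""
    chat_opened: bool = False


@dataclass
class History:
    """Stage 4 (reflection): consolidates past detections and interactions."""
    events: List[StressEvent] = field(default_factory=list)


def handle_event(event: StressEvent, response: UserResponse,
                 history: History, context_note: str = "") -> bool:
    """Run stages 2 to 4 for one detected event; return True if a chat opens."""
    # Stage 2 (user feedback): record how the user reacted to the notification.
    event.response = response
    event.context_note = context_note
    # Stage 3 (LLM-mediated support): only confirmation or added context opens a chat.
    event.chat_opened = response in (UserResponse.CONFIRM, UserResponse.ADD_CONTEXT)
    # Stage 4 (reflection): assumption: every event is logged, including rejected ones.
    history.events.append(event)
    return event.chat_opened
```

In this sketch a rejected event never reaches the LLM stage but still appears in the history, which is one plausible reading of "consolidating past detections and interactions".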

Figure 1: Notification interface of EmBot, shown upon wearable-triggered stress detection.

This staged design operationalizes key HCI principles, including just-in-time adaptivity, user agency, transparency, and iterative sense-making. For the study, stress events were simulated to ensure consistency across expert evaluations, isolating the analysis from algorithmic detection artifacts and focusing on interaction-level feedback.

Fifteen experts (licensed clinicians, clinical researchers, computer scientists in digital health) engaged with EmBot through hands-on sessions or high-fidelity walkthroughs. Interviews were structured around pre- and post-probe phases to delineate a priori assumptions about wearables/LLM technology and probe-informed reflections grounded in real interaction experience.

Key Findings: Design Tensions and Expert Recommendations

Pre-Probe Attitudes

Experts initially viewed wearables and LLMs as distinct technologies, reiterating well-known limitations of each: for wearables, susceptibility to false positives, notification fatigue, and contextual ambiguity; for LLMs, challenges with therapeutic validity and reliability. Before engaging with the probe, experts rarely connected wearable-centric and LLM-centric perspectives.

Interaction-Informed Insights

Hands-on engagement with EmBot elicited more nuanced expert perspectives on system integration and surfaced concrete design tensions:

Translating Monitoring to Dialogue: Experts emphasized the value of event-triggered dialogue in promoting ecologically valid, context-anchored user engagement. However, they identified a requirement for richer context—location, activity, sleep status—to enhance interpretability and clinical relevance.
Transparency and Notification Management: There was a consensus that users require explicit rationale for each detection (e.g., “heart rate elevation detected”), the ability to query sensor origin, and adaptive notification pacing. Fatigue from over-triggered interventions was cited as a risk to sustained engagement.
Conversational Structure and Personalization: Experts advocated for highly structured, context-adaptive conversational flows, favoring short, targeted, situation-specific questions over generic support. Preferred chat conventions included typing indicators and the potential for a conversational voice.
Data Summarization and User-Clinician Collaboration: EmBot’s ability to distill and summarize sensor-derived, conversational, and self-reported data was positioned as key to minimizing user and clinician burden, contrasting with “data overload” patterns found in many personal informatics systems.
Safety, Privacy, and Escalation Protocols: The risk of misunderstandings or inappropriate advice was highlighted, especially given that auto-initiated interventions can confer a false impression of clinical authority. Robust onboarding, privacy controls, and automatic escalation or signposting to crisis resources were considered prerequisites for deployment.
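The transparency and pacing recommendations above could be sketched as follows. This is a hypothetical illustration, not a mechanism reported in the study: the class `NotificationPacer`, the 45-minute cooldown, the daily cap of four, and the message wording are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class NotificationPacer:
    """Limits how often detections become notifications (values are assumed)."""
    cooldown: timedelta = timedelta(minutes=45)  # assumed minimum gap
    daily_cap: int = 4                           # assumed per-day maximum
    sent: List[datetime] = field(default_factory=list)

    def should_notify(self, now: datetime) -> bool:
        # Enforce the per-day cap first.
        sent_today = [t for t in self.sent if t.date() == now.date()]
        if len(sent_today) >= self.daily_cap:
            return False
        # Then enforce the minimum gap since the last notification.
        if self.sent and now - self.sent[-1] < self.cooldown:
            return False
        return True

    def notify(self, now: datetime, rationale: str, sensor: str) -> Optional[str]:
        """Return notification text with an explicit rationale, or None if suppressed."""
        if not self.should_notify(now):
            return None
        self.sent.append(now)
        return (f"Possible stress detected: {rationale} "
                f"(source: {sensor}). Does this match how you feel?")
```

Including the rationale and sensor source in the message text mirrors the experts' request for an explicit reason behind each detection; a deployed system would presumably adapt the pacing values per user rather than fix them.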

Overarching Design Tensions

The study surfaces cross-cutting tensions:

Support vs. Intrusiveness: Contextual, timely intervention must be balanced with avoidance of notification fatigue and perceived intrusiveness.
Authority vs. Agency: Wearable-triggered, LLM-driven interactions may inadvertently amplify system authority, increasing the burden of safety and rigor on system design.
Specificity vs. Generalizability: Overfitting dialogue to ambiguous or weakly-signaled sensor events risks misguiding users; excessive hedging reduces perceived value.

Implications for System Design and Theoretical Models

These empirical findings inform several axes of technical and design variation in future systems:

Trigger Modality: Wearable-driven versus time-based versus user-initiated conversational turn-taking, with implications for intervention timing, context accuracy, and perceived relevance.
LLM Functional Role: Sense-making, reflective dialogue, mediation between user and clinician—each brings unique requirements for transparency, explanation, and boundary setting.
Scope of Use: Everyday stress reflection versus adjunctive clinical tool—the boundary conditions and regulatory requirements differ significantly.
Data Management: User autonomy over data deletion, summary granularity, and longitudinal reflection architectures will be central to promoting trust and privacy.
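One way to make these axes concrete is to treat each as an enumerated option in a configuration object. The sketch below is illustrative only: the type names, the `DesignPoint` fields, and the default values for the data-management settings are assumptions, not part of the study.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product
from typing import List


class TriggerModality(Enum):
    WEARABLE = "wearable-driven"
    TIME_BASED = "time-based"
    USER_INITIATED = "user-initiated"


class LLMRole(Enum):
    SENSE_MAKING = "sense-making"
    REFLECTIVE_DIALOGUE = "reflective dialogue"
    USER_CLINICIAN_MEDIATION = "mediation between user and clinician"


class Scope(Enum):
    EVERYDAY_REFLECTION = "everyday stress reflection"
    ADJUNCTIVE_CLINICAL = "adjunctive clinical tool"


@dataclass(frozen=True)
class DesignPoint:
    """One combination of the axes above (field names are hypothetical)."""
    trigger: TriggerModality
    role: LLMRole
    scope: Scope
    user_can_delete_data: bool = True       # assumed default
    summary_granularity: str = "per-event"  # assumed default


def enumerate_design_space() -> List[DesignPoint]:
    """List every trigger/role/scope combination with default data settings."""
    return [DesignPoint(t, r, s)
            for t, r, s in product(TriggerModality, LLMRole, Scope)]
```

Enumerating the combinations (3 x 3 x 2 = 18 here) is a simple way to check which parts of the design space a given prototype covers.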

Limitations and Future Directions

The findings and design tensions identified in this work are exploratory and were not validated against a working detection algorithm. All stress triggers were simulated, so the results do not reflect the limits of real-world detection. The qualitative sample, while diverse in expert perspectives, does not generalize to end-user populations or to all clinical contexts.

Future research directions highlighted include:

Real-world deployments to assess detection accuracy, adaptive notification calibration, and engagement over longer periods.
Evaluating system integration into clinical workflows without exacerbating user or clinician burden.
Advancing the technical rigor of LLM conversation tailoring—using event- and context-specific priors from wearable and behavioral data streams.

Theoretical and Practical Impact

This work indicates that combining passive sensing with LLM-based conversational interventions in real time reframes both the opportunities and the challenges in digital mental health. The findings elaborate a set of design tensions that future work must explicitly address, particularly safety-focused, clinically integrated systems employing generative AI. The expert-driven probe methodology illustrated here is presented as a scalable framework for co-designing hybrid AI systems that couple continuous objective data with subjective interaction.

Conclusion

By leveraging expert engagement with a functioning design probe, this study highlights both design opportunities and critical tensions inherent in wearable-triggered, LLM-based conversational support systems for daily stress management. These insights delineate a multi-dimensional design space for future research, with major implications for the ongoing development, clinical translation, and ethical deployment of context-aware, AI-mediated mental health interventions (2604.04915).
