
Wednesday, April 2, 2025

'Mental Health AI Chatbot Rivals Human-Based Therapy in Less Time'

 A generative artificial intelligence (Gen-AI)–powered therapy chatbot known as Therabot was associated with significant reductions in several mental health conditions, including major depressive disorder (MDD).

Developed by members of the investigative team, Therabot is a mobile app that allows users to interact with a digital presence they understand is not a real person. Using user prompts and conversation history, the chatbot delivers tailored dialogue, including empathetic responses and targeted questions.

In a randomized controlled trial (RCT) of more than 200 US participants, those who received the chatbot intervention for 4 weeks had significantly greater symptom reductions in MDD, generalized anxiety disorder (GAD), and feeding and eating disorders (EDs) than their peers who did not receive access to the app (waitlist control group), meeting the trial's primary outcomes.

On average, engagement with the app lasted more than 6 hours and was rated highly by patients.


“The effect sizes weren’t just significant, they were huge and clinically meaningful — and mirrored what you’d see in a gold-standard dose of evidence-based treatment delivered by humans over a longer period of time,” senior study author Nicholas Jacobson, PhD, associate professor of biomedical data science and psychiatry at Dartmouth College's Geisel School of Medicine, Hanover, New Hampshire, told Medscape Medical News.

The results were published online on March 27 in NEJM AI.

Responds Like Human Therapist

Therabot is “an expert-fine-tuned” Gen-AI–powered chatbot created specifically for mental health treatment, with experts writing therapist-patient dialogues based on cognitive-behavioral therapy.

Jacobson, who is also a director at Dartmouth’s Center for Technology and Behavioral Health, Lebanon, New Hampshire, noted that the investigators started developing the app in 2019. More than 100,000 human hours of software creation and refinement have since gone into it.

“Therabot is designed to augment and enhance conventional mental health treatment services by delivering personalized, evidence-based mental health interventions at scale,” the researchers wrote.

Jacobson noted that other digital interventions created for the mental health space are often more structured and not adaptive or personalized, leading to lower engagement and high dropout rates. In addition, safety and efficacy have not been well established for many of these systems, he said.

What sets this app apart, he said, are its long development history, the diligent oversight it has received, and its “personalized dynamic feedback,” which responds much like a human therapist.

“We designed our own dataset written out with transcripts on what would be a gold-standard response to every different type of query you can imagine related to these conditions and also comorbidities,” said Jacobson.

A Starting Place

The researchers enrolled 210 adults (59.5% women; mean age, 33.9 years) with severe symptoms of MDD or GAD or at high risk for feeding and EDs. All were randomly assigned to interact daily with the chatbot intervention for 4 weeks (n = 106) or to receive no app access (waitlist, n = 104).

Jacobson noted the investigators wanted to concentrate on these three specific conditions because they are among the most common mental disorders.

“We wanted to have a starting place” that could be expanded upon in the future, including the possibility of other conditions, he added.

Daily prompts to interact with Therabot occurred throughout the 4-week treatment period. The prompts stopped after that, but the group could still access the app during the following 4-week postintervention phase.

Although the waitlist group was not given access to the app during the study period, they could gain access at the end of the follow-up at 8 weeks.

The co-primary outcomes were changes in symptoms from baseline to 4 weeks and to 8 weeks. Measures included the Patient Health Questionnaire 9, the GAD Questionnaire for the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, and the Weight Concerns Scale within the Stanford-Washington University Eating Disorder Screen.

User engagement, acceptability, and “therapeutic alliance” were all secondary outcomes. The investigators defined the latter as “the collaborative patient and therapist relationship” as measured on the Working Alliance Inventory-Short Revised (WAI-SR). Other measures included a patient satisfaction survey and the number of messages sent to the app.

Effective, Satisfying Results

Results showed that, compared with the waitlist group, the chatbot group had significantly greater reductions in MDD symptoms at 4 weeks (mean change, −2.63 with waitlist vs −6.13 with Therabot; P < .001) and 8 weeks (mean change, −4.22 vs −7.93; P < .001).

Similarly, the chatbot group also had greater reductions in symptoms of GAD at 4 weeks (mean change, −0.13 vs −2.32; P = .001) and 8 weeks (mean change, −1.11 vs −3.18; P = .003) and in symptoms of EDs at both timepoints (mean changes, −1.66 vs −9.83; P = .008 and −3.70 vs −10.23; P = .03, respectively).

These improvements “were comparable to what is reported for traditional outpatient therapy, suggesting this AI-assisted approach may offer clinically meaningful benefits,” Jacobson said in a release.

Based on WAI-SR scores, the participants also, on average, “reported a therapeutic alliance comparable to norms reported in an outpatient psychotherapy sample,” the investigators reported.

Overall satisfaction with the app averaged 5.3 on a 7-point scale, where 7 was the highest rating. In addition, the app received a 6.4 for ease of use, a 5.6 for being intuitive, a 5.4 for feeling better after a session, and a 4.9 for being “similar to a real therapist.”

The mean number of participant messages sent was 260, and the mean total amount of app interaction was 6.2 hours.

The investigators noted that they and trained clinicians examined all responses from the app, and if any inappropriate responses were given, they contacted the patient directly.

At study’s end, staff interventions were required 15 times because of safety concerns, such as after participants expressed suicidal ideation, and 13 times because of inappropriate app responses, such as providing medical advice.

First Study of Its Kind

“This is the first RCT demonstrating the effectiveness of a fully Gen-AI therapy chatbot for treating clinical-level mental health symptoms,” the investigators noted.

They credited three factors for the chatbot’s success: it was “rooted” in evidence-based psychotherapies for the three conditions treated; it offered unrestricted, anytime access; and, “unlike existing chatbots for mental health treatment, Therabot was powered by Gen-AI, allowing for natural, highly personalized, open-ended dialogue.”

Still, lead study author Michael Heinz, MD, assistant professor of psychiatry at Dartmouth and an attending psychiatrist at Dartmouth Hitchcock Medical Center, did voice some cautions.


“The feature that allows AI to be so effective is also what confers its risk — patients can say anything to it and it can say anything back,” he said in the release. That’s why the various systems being developed need rigorous safety and efficacy benchmarks, as well as supervision/involvement of mental health experts, he said.

“I don’t necessarily think they need to be used with a prescription model. I just think we need human experts in the loop until we have a good understanding of their safety and efficacy,” Heinz told Medscape Medical News.

He added that human interventions weren’t often needed with Therabot, “but that is always a risk with generative AI and our study team was ready.” 

So what’s next? Although Therabot currently remains in the research space and isn’t available to patients or clinicians, the goal is to make it widely available in the next few years, Jacobson said.

“But we want to proceed judiciously. A lot of our work is to ultimately scale it, but these models carry greater risk — in part because of their flexibility. So we want to have greater oversight and further trials before we open it up,” Jacobson added, noting that could eventually include head-to-head comparisons with live providers.

More Questions Than Answers

Commenting for Medscape Medical News, Paul Appelbaum, MD, practicing psychiatrist and professor of psychiatry at Columbia University, New York City, described the study as interesting, with promising results.


However, it was also a single study that “raises more questions than it answers about the use of AI-driven chatbots,” said Appelbaum, who was not involved with the research.

He noted that there may have been a “novelty effect” because of the intervention’s relatively short duration, and that selection bias, which the investigators mention in their paper, could have resulted in an overestimation of the effectiveness of the digital intervention for the three conditions studied.

“People who are willing to participate in a study of a chatbot may be predisposed to view technological approaches as appealing. So whether a random sample of the general population would have the same response is an open question,” Appelbaum said.

He also pointed out that the control group didn’t receive anything, and wondered how a more active control intervention would have compared to the chatbot. “Is the difference between the two groups a function of the effect from the chatbot as opposed to the negative effect of being told ‘you’re just on a waiting list?’”

Appelbaum also noted the investigators’ ongoing supervision of AI to ensure patient safety.

“I think that’s a very important caveat. There’s a temptation to read this study as indicating that we can just turn patients over to chatbots and they’ll take care of it — but that is not what happened,” he said.

Disclosures and conflicts of interest of the investigators are fully listed in the original article. Appelbaum reported no relevant financial relationships.

https://www.medscape.com/viewarticle/mental-health-ai-chatbot-rivals-human-based-therapy-less-2025a10007x3
