A machine learning tool developed by the healthtech company Kintsugi Mindful Wellness is able to detect moderate to severe depression by analyzing voice patterns.
Writing in The Annals of Family Medicine, researchers from Kintsugi, the University of California, Berkeley, and the University of Arkansas for Medical Sciences report that the tool was able to correctly identify depression in 71% of those diagnosed with the mental health condition.
Kintsugi’s tool was also able to correctly rule out the condition in 74% of a group of people who did not have it.
“Depression is a leading cause of disability, affecting an estimated 18 million Americans each year, with a lifetime prevalence of major depression approaching 30%,” explained lead author Alexa Mazur, previously based at Kintsugi and now at Bayesian Health, and colleagues.
“In 2016, the U.S. Preventive Services Task Force recommended universal… screening for adult patients when adequate follow-up is available. Still, depression screening rarely occurs in the outpatient setting.”
Kintsugi’s tool is designed to help clinicians diagnose depression based on changes in speech patterns known to be linked to the condition. It is not intended to replace standard medical assessments. For example, people with depression are more likely to stutter and hesitate when they speak. They also pause longer and more often, and have a slower speech cadence, than people with good mental health.
The research team first trained the machine learning tool using 25 seconds or more of recorded speech from 4,456 English-speaking adults in the U.S. and Canada and then validated it in a further 10,442 participants. To establish depression status, all participants filled out the Patient Health Questionnaire-9, which has a sensitivity and specificity of 88% for detecting depression. Moderate to severe depression was defined as a score of 10 or above, with 10 counting as moderately depressed.
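For readers unfamiliar with the questionnaire, the cutoff can be illustrated with a short Python sketch. It assumes the standard PHQ-9 format of nine items, each scored from 0 to 3. The function names and the constant are hypothetical and are not taken from the study or from Kintsugi's software.

```python
# Cutoff for moderate to severe depression, as defined in the study.
MODERATE_TO_SEVERE_CUTOFF = 10


def phq9_total(item_scores):
    """Sum the nine PHQ-9 item scores (each 0 to 3), giving a total of 0 to 27."""
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 has exactly nine items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item must be scored 0, 1, 2, or 3")
    return sum(item_scores)


def is_moderate_to_severe(item_scores):
    """Return True when the total reaches the cutoff of 10 or above."""
    return phq9_total(item_scores) >= MODERATE_TO_SEVERE_CUTOFF
```

Under this sketch, a respondent with a total of exactly 10 is labeled as positive, while a total of 9 is not.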
The Kintsugi test had a sensitivity of 71.3% and a specificity of 73.5% in the validation group. That is comparable to other mental health tests, which tend to score between 60% and 90% on these measures depending on the assessment.
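Sensitivity is the share of people with the condition whom a test correctly flags, and specificity is the share of people without it whom the test correctly clears. The Python sketch below shows the arithmetic. The counts are hypothetical, chosen only so that the results match the reported percentages. They are not the study's actual data.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of people with the condition that the test correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of people without the condition that the test correctly clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only (not taken from the study):
# 1,000 people with depression and 1,000 people without it.
print(f"Sensitivity: {sensitivity(713, 287):.1%}")  # Sensitivity: 71.3%
print(f"Specificity: {specificity(735, 265):.1%}")  # Specificity: 73.5%
```

In this example the test would miss 287 of the 1,000 people with depression and wrongly flag 265 of the 1,000 people without it.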
“We recognize that future studies are needed and that there is an expectation that this technology will continue to evolve and improve,” concluded the authors.
“Future studies will be directed toward determining the acceptability of augmenting primary care workflows with machine learning technology as a clinical decision-support tool and assessing the effect of other conditions that might influence depression voice biomarker analysis.”