Machine learning strategies for the diagnosis of depression


With variations across geography and demographics, the world is getting sadder. Between 2005 and 2015, the number of people afflicted with depression increased by 18.4 per cent, with 322 million people worldwide suffering the debilitating effects of clinical depression, chronic anxiety, and other related mental disorders, according to the World Health Organisation.

Many taboos surround the topics of suicide and depression, which makes it difficult to assess the true scale of the problem around the world. In some countries, suicide remains an underreported crime rather than an acknowledged health issue. Even where it is not a crime, some researchers have found evidence that the undetermined deaths (UnDs) registered by the WHO include many “hidden” suicides.

The business world might have remained indifferent were it not for the sheer practical cost of the problem. The global economic impact of depression between 2010 and 2030 is forecast at $16.3 trillion. The US Centers for Disease Control and Prevention (CDC) states that the disease, which it classifies as a “public health concern,” costs employers 200 million lost workdays each year in the US, valued at somewhere between $17 billion and $44 billion.

Those who consider depression a “non-valid” health condition might be interested to know the extraordinary extent to which it contributes to “respectable” maladies such as cardiovascular disease, stroke, and diabetes. It's acknowledged to be a debilitating condition that fuels absenteeism and forces up healthcare costs by proxy.

Patterns in the darkness

Though artificial intelligence may eventually have a part to play in the treatment of those diagnosed with depression, the current weight of depression-related AI research is geared towards using machine learning as an aid to initial diagnosis and ongoing monitoring. It's an endeavour that has an uneasy reciprocal relationship with similar commercial or cross-sector research movements in customer engagement, self-driving vehicles, human resources and security, among others.

Because of the continuing stigma and feelings of shame associated with the disease, patients are often reluctant to present themselves for diagnosis. Machine learning diagnostic systems for depression offer a novel method of observational rather than response-driven diagnosis.

However, because such potentially passive methods of analysis could be undertaken without a subject's consent, they tend to provoke public discussion when implemented, indicating a need for ethical and legislative frameworks to regulate the technology as it matures.

Taking depression diagnosis beyond the lab

Diagnostic research projects driven by machine learning modelling are classified by the extent to which the learning models that they generate depend on highly structured research. Strong results obtained in a rigid environment, however reproducible, generally lead to analysis systems that require a pre-determined set of environmental constraints.

By contrast, “context-free” detection algorithms, or systems that require little initial orientation, dataset training, applied structure, or advance knowledge, are perceived as more flexible and easier to deploy, with greater potential for popular diffusion.

However, systems of this type need to prove that they can reliably exploit a newly identified and provable human characteristic, one that applies across enough people to have value in a high-volume analysis system, and then to achieve scientific and social consensus around it. That is a lofty ambition.

How we communicate a state of depression

People can signal a depressive state through their spoken and written words, tone of voice, changes in expression, gait, posture, involuntary body responses (such as pupil dilation, the skin's galvanic response, and heart rate), and even the way that their brains respond to certain stimuli.

The spoken and written word

The words we choose when we are depressed or suffering from anxiety are useful indicators of mental state, and form the basis of several machine learning-based approaches to diagnosis, which can vary in their methodology and core contentions.

One project may analyse transcribed speech as written text. Another might study the tonality, cadence, or frequency patterns of a subject's speech, perhaps irrespective of content. Others might combine these criteria, or extend them to include some of the aforementioned ways that patients can involuntarily express a depressive state, such as posture or facial expression.

Recent research from the Massachusetts Institute of Technology (MIT) outlines two possibly competing or complementary approaches to context-free, speech-based depression diagnosis: analysis of the subjects' words, and analysis of the tone of their voices. Audio recordings and raw transcribed text of subjects' responses were used as input data for the researchers' neural network model. Results indicated that pure text analysis was more effective at identifying whether an individual was depressed at all, but that audio data was better at classifying how severe an affected subject's depression was on a scale from 0 to 27.

In 2017, a research group from Spain and Portugal presented findings of a study that used automatic transcriptions of anonymised speech samples to classify for depression — a submission aimed at the Audio/Visual Emotion Challenge.

In the UK, researchers from the University of Reading recently published findings on the extent to which “absolutist” language can indicate cognitive distortion (concluding that words such as “entirely,” “always,” and “totally” fall into this category).
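As a rough illustration of how absolutist language might be quantified, the sketch below measures the share of words in a text that appear on an absolutist word list. It is a minimal example rather than the Reading researchers' method: the list contains only the three words quoted above, and the function name, tokenisation rule, and sample sentence are assumptions made for illustration.

```python
import re

# Only the three words quoted in the article. This list is illustrative;
# a real study would rely on a longer, validated dictionary.
ABSOLUTIST_WORDS = {"entirely", "always", "totally"}


def absolutist_rate(text: str) -> float:
    """Return the fraction of word tokens that are absolutist words."""
    # Lower-case the text and keep runs of letters and apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in ABSOLUTIST_WORDS)
    return hits / len(tokens)


# Hypothetical sample sentence, invented for this example.
sample = "I always fail. Nothing I do is ever totally right."
print(f"Absolutist word rate: {absolutist_rate(sample):.2f}")
```

A rate like this only becomes meaningful when compared against a reference, such as the same measure taken from a control group's writing.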

Facial expression and other dispositions of the body

In 2014, the US National Institutes of Health (NIH) published a review of automated and manual Facial Expression Analysis techniques, concluding that “it is impossible to infer a person’s ‘true’ feelings, intentions, and desires from a single facial expression with absolute confidence.”

Generating a “context-free” diagnosis system for depression is challenging: without background knowledge (and all that cumbersome lab equipment, dataset-building, and pre-training), an analysis system is usually walking blind into the middle of a conversation. Even an internal conversation. Without a baseline, facial analysis can lead to the wrong conclusions about a person's state of mind.
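The baseline problem can be made concrete with a small sketch. Here the same raw reading is compared against two people's own earlier readings and expressed as a z-score (the number of standard deviations from that person's average). The “smile intensity” feature, its 0 to 1 scale, and all of the values are hypothetical, chosen only to show why an identical measurement can mean different things for different people.

```python
from statistics import mean, stdev


def deviation_from_baseline(baseline: list[float], observed: float) -> float:
    """Return how many standard deviations `observed` lies from the
    subject's own baseline readings (a z-score)."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline readings")
    spread = stdev(baseline)
    if spread == 0:
        raise ValueError("baseline readings have no variation")
    return (observed - mean(baseline)) / spread


# Hypothetical "smile intensity" readings (0 to 1) for two people.
expressive_person = [0.70, 0.80, 0.75, 0.85, 0.65]
reserved_person = [0.20, 0.25, 0.30, 0.22, 0.28]

reading = 0.30  # the same raw value observed for both people

# Far below the expressive person's usual range (strongly negative).
print(deviation_from_baseline(expressive_person, reading))
# Slightly above the reserved person's usual range (positive).
print(deviation_from_baseline(reserved_person, reading))
```

A system with no baseline sees only the raw value of 0.30 and has no way to tell these two cases apart.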

Nonetheless, significant industry- and government-led interest has developed around facial analysis, now an active area of research in AI-driven depression diagnosis.

In 2005, Netherlands-based AI startup MultiSense made headlines with a “virtual therapist,” apparently a PR gimmick designed to attract publicity for the underlying technology: a “perception framework” capable of tracking head movements and facial geometry, aimed at recognising depression, anxiety, and similar disorders. MultiSense is a sub-project of ICT's SimSensei initiative, which uses natural language processing and computer vision (two key fields in machine learning) to identify aberrant psychological states such as PTSD, anxiety, and depression.

In 2018, researchers from Hong Kong and Helsinki published research on using the way we walk as an index of how we are feeling. Estimating an emotional state remotely from a person's gait (an established tool in psychology) seems to raise privacy concerns if deployed publicly at scale, not least because such a system may function well even with the low-resolution video often used in CCTV systems. Gait recognition also has biometric potential, which was even exploited in recent years for an entry in the Mission: Impossible franchise.

Traditional biofeedback data

Pupil dilation, a stalwart of traditional medicine-based response tests, has been explored as an index of emotional state, as well as a potential indicator in autism and depression. But it generally falls into the wider and potentially more informative field of eye-tracking. One 2017 research paper claims an 84 per cent success rate using these techniques as a lie detector.


Haptics, the technological application of touch, is a relatively new field that's being fuelled by Virtual Reality and Augmented Reality wearable tech research. Here user-worn devices provide touch-based feedback, including factors such as topical pressure and temperature changes. Current haptic devices range from high-commitment to rather more casual.

One Finnish research study uses haptic monitoring to quantify users' emotional responses to a variety of different facial expressions. Haptic signals are also being considered in a wearable biofeedback context, likely aimed more at monitoring pre-diagnosed conditions than establishing an initial diagnosis.

Voice analysis

Vocal prosody, along with other speech-based features such as frequency, is a core theme in machine learning-based research into automated depression diagnosis. Several of the aforementioned studies include voice analysis components, and an Australian research team has just released new research indicating the diagnostic potential of this particular biometric.

A number of other studies concentrate exclusively on vocal analysis as an index of mental state. New York University's Langone Medical Centre has developed a machine learning dataset containing over 40,000 speech features from PTSD-affected subjects, 200 or more of which could well provide common indicators across PTSD victims. It is an elaborate and lengthy feature extraction process that only an AI-driven approach could make feasible.

Social media analysis

The concise and reductive nature of social media posts makes them inevitably attractive to AI-driven research projects seeking to understand human mood and emotion.

Though the availability of social media APIs for researchers is threatened by the occasional controversy, machine learning is producing some of its most interesting projects using networks such as Twitter and Facebook as publicly available datasets.

One Harvard research project analysed Instagram photos as predictive markers of depression. A study late in 2017 claimed that its AI-based algorithm is better able than health professionals to predict mental illnesses such as PTSD from social media data. But another paper cautioned that depression indicators in social media feeds need to be taken in the context of the user's social group and standing, further illustrating the wider problem of context when classifying markers for depression.

Sentiment analysis and emotion prediction projects that use social media as source data are too numerous to list, but this 2017 summary is a good introduction to the topic.

State of the art in mobile depression diagnosis and monitoring

The most widely diffused emotion recognition technology at present is the face-tracking capability of Apple's iOS mobile platform, now powered by an optimised local neural network. At a popular level, it manifests in Animoji, a system of face-tracking avatars that monitors the user's facial topology and synchronises cartoon-like icons to the emotion identified.

Apple's approach and underlying hardware could potentially feed digitised records of perceived emotional states into the more traditional health data points (such as “steps”) currently used in Apple's own Health software framework. The principles are equally applicable to other device and OS ecosystems, such as Android, and to wearable activity trackers such as the Fitbit range.

A radio-based depression-monitoring wearable system has been in development at Michigan State University since 2016. In the same year, Toronto-based startup Awake Labs began to fund a wristband designed to measure the anxiety levels of autistic subjects via traditional biofeedback indices such as electrodermal activity, heart rate, and body temperature.

The politics of emotion recognition

A 2017 report from UC Berkeley criticises the highly granular and intimate nature of data streams that disclose so much about us, arguing that they could be open to governmental, corporate, or criminal abuse. The researchers predict a new citizens' movement for “emotion obfuscation,” similar to the growing interest in defeating casual face recognition technologies in public spaces.

Further concerns have emerged in recent years around the way technology companies might share face data with third parties, including the sharing of “emotional history” with insurance companies.

Premature conclusions?

The movement towards AI-based depression diagnosis is essentially about identifying universal features that distinguish the condition and developing new technologies that are mobile and flexible enough to deploy at scale.

It's a speculative premise that can become confounded by local and national eccentricities. Markers for aberrant mental states are not necessarily identical among nations, and achieving consistent baselines even across non-mixed groups poses a challenge to many research projects.

In 2018, the CultureNet research project demonstrated this by analysing the facial behaviour of autistic children of different nationalities, and noting the diversity of marker expression across cultures. The researchers concluded that “due to the large difference in the distribution of engagement levels in the two cultures, the deep models trained on only one culture have limited ability to generalise to the other culture.”

However, like many other technologies that experience a sustained bubble of interest, commercial emotion-detection products tend to be made available well ahead of scientific consensus on their underlying principles, and these often prosper (or not) based more on anecdotal than on hard evidence.

At the time of writing you can hook into a face-reading, emotion-detecting API, see what a test audience thinks of your new movie via their facial reactions, and be rejected for a job by AI, based on your apparent mood.

The scientific impetus towards categorisation frequently seems to chafe against our desire to defy categorisation, and to defy any expectations that follow from being placed in a category. It also faces resistance because of the extent to which clinical research in this area crosses over into research with commercial rather than remedial or palliative ambitions.

Yaroslav Kuflinski, AI/ML Observer, Iflexion
Image Credit: PHOTOCREO Michal Bednarek / Shutterstock