Predicting Presence and Severity of Depression from Voice with Emotional Transfer Learning

Lasse Hansen

16-12-2020

1

About Me

  • Lasse Hansen

  • Master's student of Cognitive Science at Aarhus University, Denmark

    • Statistics, neuroscience, cognitive psychology, social dynamics
  • Intern at Data Science 1 since late August 2020

  • Supervised by Yan-Ping Zhang, Detlef Wolf, and Riccardo Fusaroli (AU)

2
  • Interested in Bayesian statistics and machine learning

Major Depressive Disorder (MDD)

3

MDD

Psychological symptoms

  • Feeling sad

  • Loss of interest and energy

  • Difficulty concentrating

Physiological symptoms

  • Fatigue

  • Stomach aches

  • Psychomotor retardation

5

Psychomotor retardation

  • Slowing of thought and speech

  • Increased tension in the vocal tract


     → Subtle changes in voice quality

6
  • Patients speak more slowly, in a more monotone voice, and with longer pauses
  • A more objective measure for screening and tracking disease progress would be useful

Detecting Depression from Voice

8

Detecting Depression from Voice

Removed audio for data privacy

9

The Project

10

The Project

  • Use emotion recognition model to predict depression

  • Controls and patients with depression, recorded at 2 visits
    6 months apart

  • Visit 2 includes only patients in remission at the 6-month follow-up

Visit  Diagnosis   Gender  N   Hamilton mean  Hamilton SD  Age mean
1      Controls    f       33  1.6            1.4          32.3
1      Controls    m       9   1.8            1.1          36.3
1      Depression  f       31  22.1           3.6          32.0
1      Depression  m       9   21.8           3.3          34.0
2      Controls    f       20  1.5            1.8          37.8
2      Controls    m       5   3.0            1.6          35.0
2      Depression  f       20  3.8            3.0          29.9
2      Depression  m       5   4.8            3.7          34.9
11
  • Emotion model trained on 3 datasets in English and German
  • Explain the interview content and length
  • Explain why this is different from other studies (transfer learning)

The Project

  • Can we predict depression based on how happy/sad their voice sounds?

  • Do patients in remission sound like depressed individuals or healthy controls?

  • Can we predict prognosis based on voice?

12

Preprocessing pipeline

Noise removal, speaker diarization, voice activity detection (VAD)


13


Feature extraction

  • Extract MFCCs every 10 ms

  • Summarize in bins of 30 seconds
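
The binning step above can be sketched as follows. This is a minimal illustration only: the slides do not say which summary statistics were used, so the per-coefficient mean and SD, the function name `summarize_bins`, and the choice to drop a trailing partial bin are all assumptions.

```python
from statistics import mean, pstdev

FRAME_STEP_MS = 10   # one MFCC vector every 10 ms (from the slide)
BIN_SECONDS = 30     # summary window length (from the slide)
FRAMES_PER_BIN = BIN_SECONDS * 1000 // FRAME_STEP_MS  # 3000 frames per bin


def summarize_bins(frames, frames_per_bin=FRAMES_PER_BIN):
    """Collapse frame-level feature vectors into per-bin mean and SD.

    frames: list of equal-length lists, one feature vector per 10 ms frame.
    Returns one dict per complete bin. A trailing partial bin is dropped,
    which is an assumption (the slides do not say how it was handled).
    """
    summaries = []
    for start in range(0, len(frames) - frames_per_bin + 1, frames_per_bin):
        window = frames[start:start + frames_per_bin]
        columns = list(zip(*window))  # one tuple of values per coefficient
        summaries.append({
            "mean": [mean(col) for col in columns],
            "sd": [pstdev(col) for col in columns],
        })
    return summaries
```

With a 10 ms step, a 65-second recording yields 6500 frames and therefore two complete 30-second bins under this scheme.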

16

Results

17

Results

18

Results

Depression vs controls at visit 1

19

precision = tp / (tp + fp), i.e. the proportion of participants predicted as depressed who are actually depressed = 75%
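
For reference, the precision formula in the note can be written as a small function. The counts in the example call are hypothetical and chosen only so that the ratio comes out at 75%; they are not the study's confusion matrix.

```python
def precision(tp, fp):
    """Precision = tp / (tp + fp): share of positive predictions that are correct."""
    predicted_positive = tp + fp
    if predicted_positive == 0:
        raise ValueError("no positive predictions, precision is undefined")
    return tp / predicted_positive


# Hypothetical counts for illustration only (not taken from the study)
example = precision(tp=3, fp=1)
print(example)
```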

Results

20

Results

21

Results

Effect of preprocessing

22

Modelling the difference

23

Bayesian T Test (BEST)

  • Bayesian Estimation Supersedes the t Test (Kruschke, 2012)

  • Provides complete information on parameters of interest
    in the form of posterior distributions

  • Can accept the null

  • Handles extreme values better

  • Easy to incorporate mixed effects

24
  • Parameters of interest: means, difference in means, SDs, difference in SDs, effect size
  • Accept null when certainty is high
  • t distribution = less sensitive to outliers
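
As a reference for the parameters listed in these notes, the two-group model in Kruschke's BEST paper can be written roughly as below. This is a sketch of the published two-group formulation only; the priors and the mixed-effects structure used in this project are not given on the slides and are not shown here.

```latex
y_{i \mid g} \sim \mathrm{StudentT}(\nu,\ \mu_g,\ \sigma_g), \qquad g \in \{1, 2\}

\Delta_\mu = \mu_1 - \mu_2, \qquad
\Delta_\sigma = \sigma_1 - \sigma_2, \qquad
d = \frac{\mu_1 - \mu_2}{\sqrt{(\sigma_1^2 + \sigma_2^2)/2}}
```

The shared normality parameter ν controls how heavy the tails are, which is what makes the estimates less sensitive to outliers than a normal likelihood.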

Bayesian Inference

"Bayesian inference is just counting"


McElreath, 2020
25
  • Assumptions with more ways of being consistent with the data are more plausible
  • Parameters that are more consistent with data are more plausible

Bayesian Inference

26
  • Coin toss: assume fair coin

Bayesian Inference

27
  • Coin toss: assume fair coin
  • Observe 6 tails, 3 heads

Bayesian Inference

28
  • Coin toss: assume fair coin
  • Observe 6 tails, 3 heads
  • Reallocation of belief
  • Difficult with many parameters -> sampling
  • Better view of the uncertainty
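
The coin-toss example in these notes can be sketched with a grid approximation, which makes the "counting" and "reallocation of belief" concrete. The prior shape and the `prior_strength` parameter are assumptions made for illustration; the slides only say that a fair coin is assumed.

```python
def grid_posterior(heads, tails, prior_strength=4, grid_size=1001):
    """Grid-approximate the posterior over p = P(heads).

    The prior is proportional to (p * (1 - p)) ** prior_strength, which
    peaks at p = 0.5 ("assume fair coin"). prior_strength is a
    hypothetical knob, not something stated on the slides.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [(p * (1 - p)) ** prior_strength for p in grid]
    # Relative number of ways each candidate p could produce the data
    likelihood = [p ** heads * (1 - p) ** tails for p in grid]
    unnormalized = [pr * lk for pr, lk in zip(prior, likelihood)]
    total = sum(unnormalized)
    posterior = [u / total for u in unnormalized]
    return grid, posterior


# Observed data from the notes: 6 tails, 3 heads
grid, posterior = grid_posterior(heads=3, tails=6)
posterior_mean = sum(p * w for p, w in zip(grid, posterior))
print(round(posterior_mean, 3))
```

Under these assumptions the exact answer is a Beta(8, 11) distribution with mean 8/19 ≈ 0.42, so belief has shifted from 0.5 towards tails without abandoning the prior entirely. With many parameters the grid grows exponentially, which is why sampling is used instead.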

BEST Results

29
  • Standard tests say there is no difference with diarization, but we can see that there is
  • Handles missing data
  • Uncertainty is explicitly modelled
  • Better estimates (with uncertainty) with less data
  • Not just yes/no: allows you to keep the uncertainty of the estimates
  • Better estimation of uncertainty out of sample
  • "For which predicted probability is the likelihood of the patient being depressed > 80%?"
  • Richer representation
  • Better knowledge of uncertainty than confidence intervals/SD

Take Home Message

  • A naïve emotion classifier can distinguish patients with
    depression from healthy controls reliably above chance level

  • Around 50% of depressed patients show marked symptoms
    in the emotional content of their voice

  • Patients who enter remission sound similar to controls

  • Bayesian methods provide a richer representation of your
    data and its uncertainty

30

31
  • Missing: those not in remission did not come back :(

  • High P(happy) does not mean no depression, just that the voice does not show the phenotype

    • People express emotion differently
    • MFCCs do not capture all of speech
  • For diarization, say how standard tests would say no difference, but with Bayes we can see that it exists

Hidden bonus slides

32
