A presentation of my internship project at Hoffmann-La Roche on using transfer learning from emotional speech to detect depression. We trained a Mixture of Experts consisting of gradient-boosted decision tree classifiers to distinguish happiness from sadness in datasets of acted emotional speech in English and German. The model was then applied to a dataset of interviews with Danish-speaking patients with first-episode depression and matched healthy controls. We observed significant separation between the two groups and found that patients in remission spoke similarly to controls. We also ran experiments on the effects of background-noise removal and speaker diarization, which showed that a consistent level of background noise is crucial for consistent inferences.