Gjallarhorn

Last updated on Dec 29, 2022

The Gjallarhorn project seeks to democratize Danish speech technology by open-sourcing models and resources. We recently released a version of XLS-R-300m pretrained on 140,000 hours of Danish radio, along with a model fine-tuned for Danish automatic speech recognition (ASR). The model outperformed the previous state-of-the-art ASR model by 20%.

The project recently received a grant from the Danish e-infrastructure Cooperation (DeiC) for computational resources to continue this line of work. During fall 2022, we will continue training and releasing new models in collaboration with Alvenir and the Alexandra Institute.

Check out our releases on the Hugging Face Hub!

Team

Lasse Hansen (PI)
Rasmus Arpe Fogh Jensen (Alvenir)
Martin Carsten Nielsen (Alvenir)
Søren Winkel Holm (Alvenir)
Anders Pedersen (Alexandra Institute)

wav2vec speech

Lasse Hansen

PhD Student in Machine Learning for Healthcare

I am a PhD student at the Department of Clinical Medicine at Aarhus University. I study how to use natural language processing and machine learning to improve patient outcomes in psychiatry. I am broadly interested in applying machine learning to solve real-world problems and in advancing open-source software.