A Derington, H Wierstorf, A Özkil, F Eyben, F Burkhardt, BW Schuller, "Testing Speech Emotion Recognition Machine Learning Models," preprint arXiv:2312.06270 (2023).

BibTeX

@article{Derington2023a,
    title   = {Testing Speech Emotion Recognition Machine Learning Models},
    author  = {Derington, Anna and Wierstorf, Hagen and \"{O}zkil, Ali
               and Eyben, Florian and Burkhardt, Felix and Schuller, Bj\"{o}rn W.},
    journal = {arXiv preprint arXiv:2312.06270},
    year    = {2023},
    url     = {https://arxiv.org/abs/2312.06270}
}

Abstract

Machine learning models for speech emotion recognition (SER) can be trained for different tasks and are usually evaluated on the basis of a few available datasets per task. Tasks could include arousal, valence, dominance, emotional categories, or tone of voice. Those models are mainly evaluated in terms of correlation or recall, and always show some errors in their predictions. The errors manifest themselves in model behavior, which can be very different along different dimensions even if the same recall or correlation is achieved by the model. This paper investigates the behavior of speech emotion recognition models with a testing framework which requires models to fulfill conditions in terms of correctness, fairness, and robustness.
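
The point that two models can reach the same recall yet behave differently can be illustrated with a minimal sketch. It is not taken from the paper or from the ser-tests software: the toy labels, the speaker groups "A" and "B", the helper names uar and group_gap, and the threshold MAX_GAP are all hypothetical, chosen only to show the idea of a test that a model passes or fails.

```python
from collections import defaultdict


def uar(truth, pred):
    """Unweighted average recall: the mean of the per-class recalls."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(truth, pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def group_gap(truth, pred, groups):
    """Largest difference in UAR between any two groups (hypothetical fairness measure)."""
    scores = []
    for g in sorted(set(groups)):
        idx = [i for i, x in enumerate(groups) if x == g]
        scores.append(uar([truth[i] for i in idx], [pred[i] for i in idx]))
    return max(scores) - min(scores)


# Toy data (made up): eight utterances, two classes, two speaker groups.
truth = ["happy"] * 4 + ["sad"] * 4
groups = ["A", "A", "B", "B", "A", "A", "B", "B"]

# Both hypothetical models make two errors, one per class.
model_1 = ["sad", "happy", "happy", "happy", "sad", "sad", "happy", "sad"]
model_2 = ["happy", "happy", "sad", "happy", "sad", "sad", "happy", "sad"]

MAX_GAP = 0.1  # assumed threshold, for illustration only

for name, pred in [("model_1", model_1), ("model_2", model_2)]:
    gap = group_gap(truth, pred, groups)
    print(name, "UAR:", uar(truth, pred), "group gap:", gap, "passed:", gap <= MAX_GAP)
```

Both toy models reach the same overall recall, but model_2 makes all of its errors on group "B", so only model_1 satisfies the assumed threshold. A single aggregate score hides this difference, which is the kind of behavior a condition-based test is meant to expose.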

Supplementary material

The software used for the tests is available at https://github.com/audeering/ser-tests and results are available at https://audeering.github.io/ser-tests/.