DocumentCode
3703307
Title
Data selection for acoustic emotion recognition: Analyzing and comparing utterance and sub-utterance selection strategies
Author
Duc Le;Emily Mower Provost
Author_Institution
Computer Science and Engineering, University of Michigan, Ann Arbor, MI, USA
fYear
2015
Firstpage
146
Lastpage
152
Abstract
Data selection is an important component of cross-corpus training and semi-supervised/active learning. However, its effect on acoustic emotion recognition is still not well understood. In this work, we perform an in-depth exploration of various data selection strategies for emotion classification from speech using classifier agreement as the selection metric. Our methods span both the traditional utterance level and the less explored sub-utterance level. A median unweighted average recall of 70.68%, comparable to the winner of the 2009 INTERSPEECH Emotion Challenge, was achieved on the FAU Aibo 2-class problem using less than 50% of the training data. Our results indicate that sub-utterance selection leads to slightly faster convergence and significantly more stable learning. In addition, diversifying instances in terms of classifier agreement produces a faster learning rate, whereas selecting those near the median results in higher stability. We show that the selected data instances can be explained intuitively based on their acoustic properties and position within an utterance. Our work helps provide a deeper understanding of the strengths, weaknesses, and trade-offs of different data selection strategies for speech emotion recognition.
Keywords
"Hidden Markov models","Training","Emotion recognition","Speech","Training data","Measurement","Acoustics"
Publisher
IEEE
Conference_Titel
Affective Computing and Intelligent Interaction (ACII), 2015 International Conference on
Electronic_ISBN
2156-8111
Type
conf
DOI
10.1109/ACII.2015.7344564
Filename
7344564