Title :
Parameter Optimization Issues for Cross-corpora Emotion Classification
Author :
Vlasenko, Bogdan ; Philippou-Hubner, David ; Wendemuth, Andreas
Author_Institution :
Center for Behavioral Brain Sci., Otto von Guericke Univ., Magdeburg, Germany
Abstract :
As speech-based emotion recognition has matured to a degree where it becomes applicable under real-life conditions, it is time for a realistic view of obtainable performance. Most state-of-the-art emotion recognition methods are based on turn- and frame-level analysis independent of phonetic transcription. True speaker-disjoint partitioning of training and test sets is still less common than simple cross-validation. Even speaker-disjoint experiments give only limited insight into the generalization ability of modern emotion recognition engines, since the training and test sets used for system development tend to be similar in acoustic channel, noise overlay, and language. A considerably more realistic impression can be gained from cross-corpora evaluation. Tuning the emotion classification engine (feature set optimization and normalization, selection of a classification technique, and the corresponding parameter configuration) is an important issue in realistic evaluations. In the ideal case, an optimal classifier configuration estimated on training data should also provide outstanding recognition performance on unseen data. We therefore compare the cross-corpora classification performance of optimized and non-optimized general and phonetic-pattern-dependent classifiers.
Keywords :
emotion recognition; optimisation; pattern classification; speaker recognition; acoustic channel; cross corpora classification; cross corpora emotion classification; cross corpora evaluation; cross validation; emotion classification engine; emotion recognition engines; frame level analysis; generalization ability; noise overlay; normalization; optimal classifier configuration; parameter optimization; phonetic pattern dependent classifiers; phonetic transcription; set optimization; speaker disjoint experiments; speech based emotion recognition; true speaker disjoint partitioning; Acoustics; Emotion recognition; Engines; Hidden Markov models; Speech; Speech recognition; Training; EMO-DB; Emotion recognition; VAM; cross-corpora; emotion perception; emotional unit; level of arousal; parameter optimization;
Conference_Title :
Affective Computing and Intelligent Interaction (ACII), 2013 Humaine Association Conference on
Conference_Location :
Geneva
DOI :
10.1109/ACII.2013.81