DocumentCode :
730790
Title :
Supervised domain adaptation for emotion recognition from speech
Author :
Abdelwahab, Mohammed ; Busso, Carlos
Author_Institution :
Dept. of Electr. Eng., Univ. of Texas at Dallas, Richardson, TX, USA
fYear :
2015
fDate :
19-24 April 2015
Firstpage :
5058
Lastpage :
5062
Abstract :
One of the main barriers in the deployment of speech emotion recognition systems in real applications is the lack of generalization of the emotion classifiers. The recognition performance achieved in controlled recordings drops when the models are tested with different speakers, channels, environments and domain conditions. This paper explores supervised model adaptation, which can improve the performance of systems evaluated with mismatched training and testing conditions. We address the following key questions in the context of supervised adaptation for speech emotion recognition: (a) how much labeled data is needed for adaptation to achieve good performance? (b) how important is speaker diversity in the labeled set? (c) can spontaneous acted data provide performance similar to that of naturalistic non-acted recordings? and (d) what is the best approach to adapt the models (domain adaptation versus incremental/online training)? We address these questions by using a multi-corpus framework where the models are trained and tested with different databases. The results indicate that even a small portion of data used for adaptation can significantly improve performance. Increasing the speaker diversity in the labeled data used for adaptation does not provide a significant gain in performance. Also, we observe similar performance when the classifiers are trained with naturalistic non-acted data and spontaneous acted data.
Keywords :
emotion recognition; pattern classification; speaker recognition; emotion classifier generalization; mismatched testing condition; mismatched training condition; multicorpus framework; speaker diversity; speech emotion recognition system; supervised domain adaptation; Adaptation models; Databases; Speech; Speech recognition; Support vector machines; Training
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD
Type :
conf
DOI :
10.1109/ICASSP.2015.7178934
Filename :
7178934