DocumentCode :
179317
Title :
A feature selection and feature fusion combination method for speaker-independent speech emotion recognition
Author :
Yun Jin ; Peng Song ; Wenming Zheng ; Li Zhao
Author_Institution :
Key Lab. of Underwater Acoust. Signal Process. of Minist. of Educ., Southeast Univ., Nanjing, China
fYear :
2014
fDate :
4-9 May 2014
Firstpage :
4808
Lastpage :
4812
Abstract :
To improve the recognition rate of speaker-independent speech emotion recognition, a combined feature selection and feature fusion method based on multiple kernel learning (MKL) is presented. First, MKL is used to obtain sparse feature subsets; the features selected at least n times are recombined into another subset, named the n-subset, and the optimal n is determined by 10-fold cross-validation. Second, feature fusion is performed at the kernel level: each kind of feature is associated with its own kernel, and, unlike previous studies, the full feature set is also associated with a kernel. All of the kernels are summed to obtain a combined kernel. The final recognition rate for 7 emotion classes on the Berlin Database is 83.10%, which outperforms state-of-the-art results and demonstrates the effectiveness of our method. The results also show that MFCCs play a crucial role in speech emotion recognition.
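The kernel-level fusion described above can be illustrated with a minimal sketch: one kernel per feature group plus one kernel over the full feature set, all summed into a combined Gram matrix and fed to an SVM. This is a generic scikit-learn approximation, not the authors' MKL implementation; the data, the two feature groups, and the use of RBF kernels are all illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

# Synthetic stand-in for speech emotion features (hypothetical data);
# in the paper these would be e.g. MFCC and prosodic feature groups.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(60, 10))
y_train = rng.integers(0, 2, size=60)
X_test = rng.normal(size=(20, 10))

# Two hypothetical feature groups splitting the full feature vector.
groups = [slice(0, 5), slice(5, 10)]

def combined_kernel(A, B):
    """Sum one RBF kernel per feature group plus one on the full set."""
    K = rbf_kernel(A, B)                      # kernel on the full feature set
    for g in groups:
        K = K + rbf_kernel(A[:, g], B[:, g])  # one kernel per feature group
    return K

# Train an SVM directly on the precomputed combined Gram matrix.
clf = SVC(kernel="precomputed")
clf.fit(combined_kernel(X_train, X_train), y_train)
pred = clf.predict(combined_kernel(X_test, X_train))
```

Summing kernels is equivalent to concatenating the implicit feature maps, which is why adding a full-set kernel alongside the per-group kernels can add information rather than merely duplicating it.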
Keywords :
emotion recognition; speaker recognition; speech processing; feature fusion; feature selection; multiple kernel learning; sparse feature subsets; speaker-independent speech emotion recognition; databases; feature extraction; kernel; speech; speech recognition;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
Type :
conf
DOI :
10.1109/ICASSP.2014.6854515
Filename :
6854515