DocumentCode :
3282096
Title :
An investigation of speech-based human emotion recognition
Author :
Wang, Yongjin ; Guan, Ling
Author_Institution :
Dept. of Electr. & Comput. Eng., Ryerson Univ., Toronto, Ont., Canada
fYear :
2004
fDate :
29 Sept.-1 Oct. 2004
Firstpage :
15
Lastpage :
18
Abstract :
This paper presents our recent work on recognizing human emotion from the speech signal. The proposed recognition system was tested on a language-, speaker-, and context-independent emotional speech database. Prosodic, Mel-frequency cepstral coefficient (MFCC), and formant frequency features are extracted from the speech utterances. We perform feature selection using the stepwise method based on Mahalanobis distance. The selected features are used to classify the utterances into their corresponding emotional classes. Different classification algorithms, including the maximum likelihood classifier (MLC), Gaussian mixture model (GMM), neural network (NN), K-nearest neighbors (K-NN), and Fisher's linear discriminant analysis (FLDA), are compared in this study. The recognition results show that FLDA gives the best recognition accuracy on the selected features.
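To make the comparison concrete, the sketch below illustrates Fisher's linear discriminant analysis (FLDA), the classifier the abstract reports as most accurate. It is a minimal two-class illustration on synthetic feature vectors standing in for the selected speech features; the class labels, dimensions, and data are hypothetical and not from the paper's database.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for selected speech features: 100 samples per
# class, 4 features each, with separated class means (hypothetical
# "neutral" vs. "angry" emotion classes).
n, d = 100, 4
X0 = rng.normal(loc=0.0, scale=1.0, size=(n, d))
X1 = rng.normal(loc=2.0, scale=1.0, size=(n, d))

mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)

# Within-class scatter: sum of the per-class covariance matrices.
Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)

# Fisher discriminant direction: w = Sw^{-1} (mu1 - mu0),
# maximizing between-class over within-class scatter.
w = np.linalg.solve(Sw, mu1 - mu0)

# Classify by projecting onto w and thresholding at the midpoint
# between the two projected class means.
threshold = 0.5 * ((mu0 @ w) + (mu1 @ w))
pred0 = (X0 @ w) > threshold   # ideally False for class 0
pred1 = (X1 @ w) > threshold   # ideally True for class 1

accuracy = (np.sum(~pred0) + np.sum(pred1)) / (2 * n)
print(f"training accuracy: {accuracy:.2f}")
```

In the paper's setting the inputs would instead be the prosodic, MFCC, and formant features surviving the stepwise Mahalanobis selection, with one discriminant per pair of emotion classes (or a multi-class extension).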
Keywords :
Gaussian processes; cepstral analysis; emotion recognition; maximum likelihood estimation; neural nets; signal classification; speech recognition; Gaussian mixture model; K-nearest neighbor; classification algorithm; formant frequency feature; linear discriminant analysis; maximum likelihood classifier; neural network; speech-based human emotion recognition; stepwise method; Cepstral analysis; Emotion recognition; Feature extraction; Humans; Mel frequency cepstral coefficient; Natural languages; Neural networks; Spatial databases; Speech recognition; System testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
2004 IEEE 6th Workshop on Multimedia Signal Processing
Print_ISBN :
0-7803-8578-0
Type :
conf
DOI :
10.1109/MMSP.2004.1436403
Filename :
1436403