DocumentCode :
1835971
Title :
Vocal emotion recognition in five languages of Assam using features based on MFCCs and Eigen Values of Autocorrelation Matrix in presence of babble noise
Author :
Kandali, Aditya Bihar ; Routray, Aurobinda ; Basu, Tapan Kumar
Author_Institution :
Dept. of Electr. Eng., Indian Inst. of Technol. Kharagpur, Kharagpur, India
fYear :
2010
fDate :
29-31 Jan. 2010
Firstpage :
1
Lastpage :
5
Abstract :
This work investigates whether, in vocal emotion expression, (i) a discrete emotion can be distinguished from 'no-emotion' (i.e. neutral), (ii) one discrete emotion can be distinguished from another, (iii) surprise, which is actually a cognitive component that could be present with any emotion, can also be recognized as a distinct emotion, and (iv) discrete emotions can be recognized cross-lingually. The study is expected to yield more information about the nature and function of emotion. Furthermore, it should help in developing a generalized vocal emotion recognition system, which would increase the efficiency of human-machine interaction systems. An emotional speech database consisting of short sentences in six full-blown basic emotions and neutral is created, with 140 simulated utterances per speaker, in five native languages of Assam. The database is validated by a listening test. A new feature set is proposed based on the Eigen Values of the Autocorrelation Matrix (EVAM) of each frame of the speech signal. A Gaussian Mixture Model (GMM) is used as the classifier. The performance of the proposed feature set is compared with that of Mel Frequency Cepstral Coefficients (MFCCs) at a sampling frequency of 8.1 kHz and with additive babble noise at Signal-to-Noise Ratios (SNRs) of 5 dB and 0 dB under matched noise training and testing conditions.
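The abstract evaluates features with additive babble noise at fixed SNRs of 5 dB and 0 dB. As a rough illustration of what mixing at a target SNR involves, the sketch below scales a noise signal so that the ratio of speech power to noise power matches the requested value. It is a minimal sketch and not the authors' procedure: the function names (`power`, `mix_at_snr`, `measured_snr_db`), the synthetic sine "speech", and the Gaussian noise standing in for babble are all assumptions made for the example.

```python
import math
import random


def power(x):
    """Mean power (average squared amplitude) of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaling the noise so the mix has the target SNR (dB).

    Hypothetical helper for illustration; returns (noisy_signal, noise_gain).
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech, p_noise = power(speech), power(noise)
    if p_noise == 0:
        raise ValueError("noise has zero power")
    # SNR_dB = 10*log10(P_speech / (g^2 * P_noise)), solved for the gain g.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)], gain


def measured_snr_db(speech, noisy):
    """SNR in dB of a noisy signal, given the clean speech it was built from."""
    residual = [y - s for s, y in zip(speech, noisy)]
    return 10 * math.log10(power(speech) / power(residual))


# Assumed stand-ins: a 200 Hz tone for speech, white Gaussian noise for babble.
fs = 8100  # sampling frequency in Hz, taken from the abstract (8.1 kHz)
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 200 * t / fs) for t in range(fs)]
noise = [rng.gauss(0.0, 1.0) for _ in range(fs)]

for target in (5, 0):  # the two SNRs named in the abstract
    noisy, gain = mix_at_snr(speech, noise, target)
    print(target, "dB target, measured:", measured_snr_db(speech, noisy))
```

Under matched training and testing conditions, as the abstract describes, the same kind of mixing would be applied to both the training and the test utterances at the same SNR.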
Keywords :
Gaussian processes; eigenvalues and eigenfunctions; emotion recognition; man-machine systems; matrix algebra; natural language processing; noise; signal classification; speech recognition; user interfaces; Assam; Gaussian mixture model; Mel frequency cepstral coefficients; additive babble noise; autocorrelation matrix; eigen values; emotional speech database; frequency 8.1 kHz; generalized vocal emotion recognition system; human-machine interaction system; listening test; noise figure 5 dB; Additive noise; Autocorrelation; Emotion recognition; Man machine systems; Mel frequency cepstral coefficient; Natural languages; Signal to noise ratio; Spatial databases; Speech; Testing; Eigen Values of Autocorrelation Matrix; Full-blown Basic Emotion; GMM; MFCC; Vocal Emotion;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Communications (NCC), 2010 National Conference on
Conference_Location :
Chennai
Print_ISBN :
978-1-4244-6383-1
Type :
conf
DOI :
10.1109/NCC.2010.5430205
Filename :
5430205