DocumentCode :
676907
Title :
Temporal Discrete Cosine Transform for speech emotion recognition
Author :
Popovic, Branko ; Stankovic, Isidora ; Ostrogonac, Stevan
Author_Institution :
Fac. of Tech. Sci., Univ. of Novi Sad, Novi Sad, Serbia
fYear :
2013
fDate :
2-5 Dec. 2013
Firstpage :
87
Lastpage :
90
Abstract :
Temporal Discrete Cosine Transform (TDCT) features have shown good performance in the speaker verification task, and in this paper we utilize them in speech emotion recognition. Tests were conducted on a Serbian emotional speech database, using Neural Networks (NN) as a classifier and Mel-Frequency Cepstral Coefficients (MFCC) as a reference feature set. Even though MFCC is one of the most widely used techniques in emotion recognition, our results show that the TDCT features outperform MFCCs (with the first and second derivatives) with any number of hidden nodes in the network, hence proving to be an excellent starting feature set for recognizing emotions in South Slavic languages.
Keywords :
audio databases; discrete cosine transforms; emotion recognition; feature extraction; natural language processing; neural nets; pattern classification; speaker recognition; MFCC; Mel-frequency cepstral coefficients; Serbian emotional speech database; South Slavic languages; TDCT features; neural network classifier; reference feature set; speaker verification task; speech emotion recognition; temporal discrete cosine transform; Databases; Discrete cosine transforms; Emotion recognition; Mel frequency cepstral coefficient; Speech; Speech recognition; Vectors; emotion recognition; mel-frequency cepstral coefficients; speech; time discrete cosine transform;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cognitive Infocommunications (CogInfoCom), 2013 IEEE 4th International Conference on
Conference_Location :
Budapest
Print_ISBN :
978-1-4799-1543-9
Type :
conf
DOI :
10.1109/CogInfoCom.2013.6719219
Filename :
6719219