DocumentCode :
3599820
Title :
Joined cepstral distance features two-stage multi-class classification for emotional speech
Author :
Changqin Quan ; Bin Zhang ; Fuji Ren
Author_Institution :
Hefei Univ. of Technol., Hefei, China
fYear :
2014
Firstpage :
91
Lastpage :
96
Abstract :
This letter presents a two-stage multi-class classification method for emotional speech that joins cepstral distance and voice quality features and uses DAG-SVM. The harmonic-to-noise ratio (HNR) is applied to detect throat diseases because it reflects characteristics of the throat. These characteristics also provide a strong basis for distinguishing emotions in speech. The cepstrum and cepstral distance can measure differences as well, and are widely used for endpoint detection in speech signals. In this work, cepstral distance is used to measure the similarity between frames of emotional statements and frames of neutral signals. The experiments show that cepstral distance increases the recognition rate for the emotion sad and balances the recognition rates of the other emotion classes except angry. Finally, because these feature sets differ in their ability to express different emotions, a two-stage classification is applied to resolve confusion in multi-emotion recognition. For recognition, a Chinese Mandarin emotion database is used, and a large training set (1134+378 utterances) ensures a powerful modeling capability for predicting emotion.
Keywords :
diseases; emotion recognition; medical signal processing; patient diagnosis; signal classification; signal detection; speech processing; support vector machines; Chinese mandarin emotion database; DAG-SVM; emotional speech; emotional statement; harmonic to noise ratio; joined cepstral distance; large training set; multiemotion recognition; neutral signals; speech signal detection; throat disease detection; two-stage multi-class classification; two-stage multiclass classification; two-state classification; voice quality feature; Accuracy; Emotion recognition; Feature extraction; Mel frequency cepstral coefficient; Speech; Speech recognition; Cepstral distance; Emotional speech recognition; HNR; PCA; two-stage classification;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cloud Computing and Intelligence Systems (CCIS), 2014 IEEE 3rd International Conference on
Print_ISBN :
978-1-4799-4720-1
Type :
conf
DOI :
10.1109/CCIS.2014.7175709
Filename :
7175709