Title :
Detecting Practical Speech Emotion in a Cognitive Task
Author :
Zou, Cairong ; Huang, Chengwei ; Han, Dong ; Zhao, Li
Author_Institution :
Sch. of Electron. & Inf. Eng., Foshan Univ., Foshan, China
Date :
July 31 2011-Aug. 4 2011
Abstract :
In this paper we analyze speech emotions related to the cognitive process. An automatic system is established for detecting speech emotions including anxiety, hesitation, confidence and joy. To obtain a naturalistic database, we use noise to induce negative emotions; sleep deprivation is also used for this purpose, since lack of sleep is an important cause of anxiety. The emotional speech is then annotated manually, with a self-evaluation of the felt emotion in each utterance. Acoustic features, including voice quality features, are extracted for both the valence dimension and the arousal dimension. For the recognition algorithm, a Gaussian Mixture Model is adopted to detect each type of emotion against neutral speech. Building on the detection of each emotion, an error correcting method is proposed for the continuous recognition of emotion states: using the previous emotion states and cognitive performance, detection errors in the current emotion state are corrected with empirical probabilities. Experimental results show that our system can detect "practical" speech emotions related to the cognitive process. With the proposed error correcting method, recognition performance improves over the baseline system based on the Gaussian Mixture Model. We believe the detection of these "practical" emotions is important for real-world applications, especially for helping people cope with negative emotions in cognitive activities.
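The error correcting idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual formulation, so everything below is an assumption made for illustration: the function names `correct_state` and `decode`, the use of a simple product of detector score and empirical probability, the `floor` value for unseen contexts, and all numbers in the usage note are hypothetical.

```python
def correct_state(scores, prev_state, task_ok, context_prob, floor=1e-3):
    """Re-rank per-emotion detector scores with an empirical context table.

    scores:       dict emotion -> non-negative detector score
                  (assumed here to come from per-emotion GMM detectors)
    prev_state:   emotion label decided for the previous utterance
    task_ok:      bool, whether the speaker's last cognitive-task answer was correct
    context_prob: dict (prev_state, task_ok) -> dict emotion -> empirical probability
    floor:        small probability used for emotions unseen in that context
    """
    prior = context_prob.get((prev_state, task_ok), {})
    # Weight each detector score by how often that emotion followed this context.
    weighted = {e: s * prior.get(e, floor) for e, s in scores.items()}
    return max(weighted, key=weighted.get)


def decode(score_seq, perf_seq, context_prob, start="neutral"):
    """Apply the correction utterance by utterance, feeding each decision forward."""
    states, prev = [], start
    for scores, ok in zip(score_seq, perf_seq):
        prev = correct_state(scores, prev, ok, context_prob)
        states.append(prev)
    return states
```

As a usage illustration with made-up values: given detector scores `{"neutral": 0.2, "anxiety": 0.35, "joy": 0.45}`, an uncorrected detector would pick joy. If the context table says that after a neutral state and a failed task the empirical probabilities are anxiety 0.6, neutral 0.3 and joy 0.1, the weighted scores become 0.21, 0.06 and 0.045, and the corrected decision is anxiety.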
Keywords :
Gaussian processes; acoustic signal processing; cognitive systems; emotion recognition; error correction; feature extraction; sleep; Gaussian mixture model; acoustic feature extraction; anxiety; arousal dimension; cognitive process; cognitive task; confidence; emotion detection; error correcting method; hesitation; joy; naturalistic database; neutral speech; noise; sleep deprivation; speech emotion; valence dimension; voice quality; Context; Emotion recognition; Feature extraction; Hidden Markov models; Speech; Speech recognition; Support vector machine classification;
Conference_Title :
Computer Communications and Networks (ICCCN), 2011 Proceedings of 20th International Conference on
Conference_Location :
Maui, HI
Print_ISBN :
978-1-4577-0637-0
DOI :
10.1109/ICCCN.2011.6005883