Title :
A preliminary study of emotion recognition employing adaptive Gaussian mixture models with the maximum a posteriori principle
Author :
Jing-Hsiang Yang ; Jeih-weih Hung
Author_Institution :
Dept. of Electr. Eng., Nat. Chi Nan Univ., Nantou, Taiwan
Abstract :
In this paper, we present a novel processing structure that improves the performance of automatic speech emotion recognition. In this structure, a Gaussian mixture model (GMM) is first trained for each emotion type using speech features from the training set, which consists of utterances produced by several speakers. Next, the emotion GMMs are adapted with a portion of the speaker-specific data in the training set under the maximum a posteriori (MAP) criterion. The resulting GMMs are expected to be better suited than the original speaker-independent GMMs to recognizing emotion in test utterances produced by that specific speaker. Experimental results show that MAP adaptation of the GMMs significantly improves emotion recognition accuracy, regardless of whether the speech features are mel-frequency cepstral coefficients (MFCC) or perceptual linear predictive cepstral coefficients (PLPCC).
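The two-stage structure described in the abstract (one speaker-independent GMM per emotion, then MAP adaptation toward one speaker's data) can be illustrated with a minimal sketch. The abstract does not say which GMM parameters are adapted or how the prior is weighted, so the code below assumes the common mean-only MAP update with a relevance factor and diagonal covariances. The function names, the `relevance` parameter, and the `(weight, mean, var)` tuple layout are hypothetical choices for illustration, not the authors' implementation.

```python
import math

# Assumption: each GMM is a list of (weight, mean, var) components with
# diagonal covariance; mean and var are lists of per-dimension floats.

def component_log_scores(x, gmm):
    """Log of weight * Gaussian density for every component at frame x."""
    scores = []
    for weight, mean, var in gmm:
        log_density = sum(
            -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, mean, var)
        )
        scores.append(math.log(weight) + log_density)
    return scores

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def map_adapt_means(gmm, frames, relevance=16.0):
    """Return a copy of gmm whose means are MAP-adapted toward frames.

    Assumed update (mean-only, relevance-factor form):
        new_mean = (sum_t r_t * x_t + relevance * old_mean) / (n + relevance)
    where r_t is the component's responsibility for frame t and n = sum_t r_t.
    Weights and variances are left unchanged.
    """
    dim = len(gmm[0][1])
    counts = [0.0] * len(gmm)
    sums = [[0.0] * dim for _ in gmm]
    for x in frames:
        scores = component_log_scores(x, gmm)
        norm = log_sum_exp(scores)
        for c, score in enumerate(scores):
            resp = math.exp(score - norm)  # posterior of component c given x
            counts[c] += resp
            for d in range(dim):
                sums[c][d] += resp * x[d]
    adapted = []
    for c, (weight, mean, var) in enumerate(gmm):
        new_mean = [
            (sums[c][d] + relevance * mean[d]) / (counts[c] + relevance)
            for d in range(dim)
        ]
        adapted.append((weight, new_mean, list(var)))
    return adapted

def average_log_likelihood(frames, gmm):
    """Mean per-frame log-likelihood of an utterance under one GMM."""
    total = sum(log_sum_exp(component_log_scores(x, gmm)) for x in frames)
    return total / len(frames)

def classify(frames, emotion_gmms):
    """Pick the emotion whose GMM gives the utterance the highest likelihood."""
    return max(
        emotion_gmms,
        key=lambda emotion: average_log_likelihood(frames, emotion_gmms[emotion]),
    )
```

In use, `emotion_gmms` would map each emotion label to its speaker-independent GMM, `map_adapt_means` would be called once per emotion with that speaker's adaptation frames (for example MFCC or PLPCC vectors), and `classify` would then be run on the speaker's test utterances with the adapted models. A small `relevance` value trusts the speaker data more; a large one keeps the adapted means close to the speaker-independent prior.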
Keywords :
Gaussian processes; emotion recognition; maximum likelihood estimation; mixture models; speech recognition; MAP criterion; MFCC; PLPCC; adaptive Gaussian mixture models; automatic speech emotion recognition accuracy; emotion GMM; maximum a posteriori principle; mel-frequency cepstral coefficients; perceptual linear predictive cepstral coefficients; speaker-independent GMM; speech features; Hidden Markov models; Mel frequency cepstral coefficient; Speech; Training; GMM; MAP adaptation
Conference_Title :
Information Science, Electronics and Electrical Engineering (ISEEE), 2014 International Conference on
Conference_Location :
Sapporo
Print_ISBN :
978-1-4799-3196-5
DOI :
10.1109/InfoSEEE.2014.6946186