DocumentCode :
1691155
Title :
Speaker variability in speech based emotion models - Analysis and normalisation
Author :
Sethu, Vidhyasaharan ; Epps, Julien ; Ambikairajah, E.
Author_Institution :
Sch. of Electr. Eng. & Telecommun., Univ. of New South Wales, Sydney, NSW, Australia
fYear :
2013
Firstpage :
7522
Lastpage :
7526
Abstract :
All features commonly utilised in speech based emotion classification systems capture both emotion-specific information and speaker-specific information. This paper proposes a novel method to gauge the effect of speaker-specific information on emotion modelling based on two measures: a Monte Carlo approximation to KL divergence and an estimate of feature variability based on diagonal covariance matrices. In addition, a novel speaker normalisation technique based on joint factor analysis is also proposed. This method is analogous to channel compensation in speaker verification systems, with one significant extension. The model domain compensation is mapped back to frame-level features, allowing for use in a wider range of emotion classification frameworks and in conjunction with other normalisation techniques. Preliminary evaluations on the IEMOCAP database suggest that the proposed technique improves the performance of GMM based classification systems based on widely employed features such as pitch, MFCCs and deltas.
Keywords :
Monte Carlo methods; covariance matrices; emotion recognition; speech recognition; IEMOCAP database; KL divergence; Monte Carlo approximation; channel compensation; diagonal covariance matrix; emotion specific information; feature variability estimation; joint factor analysis; speaker normalisation technique; speaker specific information; speaker variability; speaker verification system; speech based emotion classification; speech based emotion model; Data models; Feature extraction; Mel frequency cepstral coefficient; Speech; Speech processing; Speech recognition; Support vector machine classification; KL divergence; emotion classification; joint factor analysis; speaker normalisation;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2013.6639125
Filename :
6639125