Title :
Robust Analysis and Weighting on MFCC Components for Speech Recognition and Speaker Identification
Author :
Zhou, Xi ; Fu, Yun ; Liu, Ming ; Hasegawa-Johnson, Mark ; Huang, Thomas S.
Author_Institution :
Univ. of Illinois at Urbana-Champaign, Urbana
Abstract :
Mismatch between training and testing data is a major source of error in both automatic speech recognition (ASR) and automatic speaker identification (ASI). In this paper, we first present a statistical weighting concept that exploits the unequal sensitivity of mel-frequency cepstral coefficient (MFCC) components to mismatch from sources such as ambient noise, recording equipment, transmission channels, and inter-speaker variations. Building on this concept, we then design a new Kullback-Leibler (KL) distance based weighting algorithm for real-world problems in which label information is often not provided. We evaluate the algorithm on ASR with mismatch caused by different speakers and on ASI with mismatch caused by channel noise. Experimental results demonstrate the effectiveness and robustness of the proposed method.
Keywords :
cepstral analysis; speaker recognition; statistical analysis; Kullback-Leibler distance based weighting algorithm; ambient noise; automatic speaker identification; automatic speech recognition; interspeaker variations; mel-frequency cepstral coefficients; recording equipment; statistical weighting; transmission channel; Automatic speech recognition; Cepstral analysis; Decoding; Degradation; Hidden Markov models; Mel frequency cepstral coefficient; Robustness; Speech analysis; Speech recognition; Working environment noise;
Conference_Title :
Multimedia and Expo, 2007 IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
1-4244-1016-9
Electronic_ISBN :
1-4244-1017-7
DOI :
10.1109/ICME.2007.4284618