Title :
Modulation spectrum equalization for robust speech recognition
Author :
Sun, Liang-Che ; Hsu, Chang-Wen ; Lee, Lin-shan
Author_Institution :
Nat. Taiwan Univ., Taipei
Abstract :
Two approaches to modulation spectrum equalization are proposed for robust feature extraction in speech recognition. In both, the temporal trajectories of the feature parameters are first transformed into the modulation spectrum. In the spectral histogram equalization (SHE) approach, the histogram of the modulation spectrum of each utterance is equalized to a reference histogram obtained from clean training data. In the magnitude ratio equalization (MRE) approach, the ratio of lower- to higher-frequency component magnitudes in the modulation spectrum is equalized to a reference value, also obtained from clean training data. Preliminary experiments in the AURORA 2 testing environment indicate that both approaches, when integrated with cepstral mean and variance normalization (CMVN), can achieve significant performance improvements on all testing sets (A, B, and C), for all types of noise and all SNR values. MRE also offers additional performance improvements when integrated with more advanced feature normalization approaches such as histogram equalization (HEQ) and higher-order cepstral moment normalization (HOCMN).
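The two equalization ideas described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of a single FFT over the whole utterance as the modulation spectrum, the rank-based quantile mapping for SHE, the cutoff bin, and the choice to rescale only the higher-frequency bins in MRE are all assumptions made here for illustration. The sketch handles one feature dimension at a time and reuses the original phase when transforming back to the time domain.

```python
import numpy as np


def modulation_spectrum(trajectory):
    """Return DFT magnitude and phase of one feature trajectory (1-D, over frames)."""
    spectrum = np.fft.rfft(trajectory)
    return np.abs(spectrum), np.angle(spectrum)


def rebuild(magnitude, phase, n_frames):
    """Invert a (magnitude, phase) pair back to a time trajectory."""
    return np.fft.irfft(magnitude * np.exp(1j * phase), n=n_frames)


def she(trajectory, reference_magnitudes):
    """SHE sketch (assumed form): map magnitudes onto reference quantiles by rank."""
    mag, phase = modulation_spectrum(trajectory)
    ranks = np.argsort(np.argsort(mag))          # rank of each bin within the utterance
    cdf = (ranks + 0.5) / len(mag)               # empirical CDF value per bin
    new_mag = np.quantile(reference_magnitudes, cdf)  # inverse reference CDF
    return rebuild(new_mag, phase, len(trajectory))


def magnitude_ratio(mag, cutoff_bin):
    """Ratio of summed lower-frequency to summed higher-frequency magnitudes."""
    return mag[:cutoff_bin].sum() / mag[cutoff_bin:].sum()


def mre(trajectory, reference_ratio, cutoff_bin):
    """MRE sketch (assumed form): rescale high bins so the ratio hits the reference."""
    mag, phase = modulation_spectrum(trajectory)
    scale = magnitude_ratio(mag, cutoff_bin) / reference_ratio
    new_mag = mag.copy()
    new_mag[cutoff_bin:] *= scale                # low / (high * scale) == reference_ratio
    return rebuild(new_mag, phase, len(trajectory))


# Synthetic stand-ins for a clean and a noisy feature trajectory (hypothetical data).
rng = np.random.default_rng(0)
clean = np.cumsum(rng.standard_normal(200))
noisy = clean + 3.0 * rng.standard_normal(200)

ref_mag, _ = modulation_spectrum(clean)
ref_ratio = magnitude_ratio(ref_mag, cutoff_bin=10)

she_out = she(noisy, ref_mag)
mre_out = mre(noisy, ref_ratio, cutoff_bin=10)
```

In a real system the reference histogram and reference ratio would be estimated from the whole clean training set rather than from a single trajectory, and the equalization would be applied to every feature dimension, typically after CMVN as the abstract describes.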
Keywords :
cepstral analysis; feature extraction; speech recognition; statistical analysis; AURORA 2 testing environment; SNR value; cepstral mean; higher-order cepstral moment normalization; histogram equalization; magnitude ratio equalization approach; modulation spectrum equalization; robust feature extraction; robust speech recognition; spectral histogram equalization approach; temporal trajectory; variance normalization; Frequency; Histograms; Performance evaluation; Robustness; Testing; Training data; Working environment noise; Modulation spectrum; feature normalization; temporal filter
Conference_Title :
Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4244-1746-9
Electronic_ISBN :
978-1-4244-1746-9
DOI :
10.1109/ASRU.2007.4430088