Regional Information Center for Science and Technology - Modulation spectrum equalization for robust speech recognition

DocumentCode :

2768661

Title :

Modulation spectrum equalization for robust speech recognition

Author :

Sun, Liang-Che ; Hsu, Chang-Wen ; Lee, Lin-shan

Author_Institution :

Nat. Taiwan Univ., Taipei

fYear :

2007

fDate :

9-13 Dec. 2007

Firstpage :

Lastpage :

Abstract :

Two approaches for modulation spectrum equalization are proposed for robust feature extraction in speech recognition. In both cases the temporal trajectories of the feature parameters are first transformed into the modulation spectrum. In the spectral histogram equalization (SHE) approach, we equalize the histogram of the modulation spectrum for each utterance to a reference histogram obtained from clean training data. In the magnitude ratio equalization (MRE) approach, we equalize the magnitude ratio of lower to higher frequency components in the modulation spectrum to a reference value also obtained from clean training data. Preliminary experiments on the AURORA 2 testing environment indicate that significant performance improvements are achievable with these approaches, when integrated with cepstral mean and variance normalization (CMVN), across all testing sets (A, B, and C), all types of noise, and all SNR values. We also show that MRE offers additional performance improvements when integrated with other, more advanced feature normalization approaches such as histogram equalization (HEQ) and higher-order cepstral moment normalization (HOCMN).
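
The MRE idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the low and high bands are split, how the magnitudes are aggregated, or which band is rescaled. The code below assumes a single cutoff bin, summed magnitudes with the DC bin excluded, and rescaling of the high band only. The names `cutoff`, `reference_ratio`, `magnitude_ratio`, and `equalize_magnitude_ratio` are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real-valued trajectory."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse DFT; returns the real part of each reconstructed sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def magnitude_ratio(spec, cutoff):
    """Summed low-band magnitude over summed high-band magnitude.

    Assumption: bins 1..cutoff are "low", bins cutoff+1..n//2 are "high",
    and the DC bin (k = 0) is excluded from both bands.
    """
    half = len(spec) // 2
    low = sum(abs(spec[k]) for k in range(1, cutoff + 1))
    high = sum(abs(spec[k]) for k in range(cutoff + 1, half + 1))
    return low / high


def equalize_magnitude_ratio(trajectory, reference_ratio, cutoff):
    """Rescale the high band so the low/high ratio equals reference_ratio.

    Assumption: only the high band is modified; the low band and the
    DC bin (the trajectory mean) are left untouched.
    """
    spec = dft(trajectory)
    n = len(spec)
    scale = magnitude_ratio(spec, cutoff) / reference_ratio
    equalized = list(spec)
    # Covers the high-band bins and their conjugate mirrors, which keeps
    # the spectrum conjugate-symmetric and the output real-valued.
    for k in range(cutoff + 1, n - cutoff):
        equalized[k] = spec[k] * scale
    return idft(equalized)


# Toy trajectory of one feature parameter over 16 frames (made-up values).
trajectory = [math.sin(0.4 * t) + 0.5 * math.cos(2.5 * t) for t in range(16)]
equalized = equalize_magnitude_ratio(trajectory, reference_ratio=2.0, cutoff=3)
```

In a real front end, `reference_ratio` would be estimated from clean training data, as the abstract states, and the equalization would be applied to each feature dimension of each utterance.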

Keywords :

cepstral analysis; feature extraction; speech recognition; statistical analysis; AURORA 2 testing environment; SNR value; cepstral mean; higher-order cepstral moment normalization; histogram equalization; magnitude ratio equalization approach; modulation spectrum equalization; robust feature extraction; robust speech recognition; spectral histogram equalization approach; temporal trajectory; variance normalization; Cepstral analysis; Feature extraction; Frequency; Histograms; Performance evaluation; Robustness; Speech recognition; Testing; Training data; Working environment noise; Modulation spectrum; feature normalization; robust feature extraction; temporal filter;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on

Conference_Location :

Kyoto

Print_ISBN :

978-1-4244-1746-9

Electronic_ISBN :

978-1-4244-1746-9

Type :

conf

DOI :

10.1109/ASRU.2007.4430088

Filename :

4430088

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2768661