Title :
A novel approach to soft-mask estimation and Log-Spectral enhancement for robust speech recognition
Author :
Van Hout, Julien ; Alwan, Abeer
Author_Institution :
Electr. Eng. Dept., Univ. of California, Los Angeles, CA, USA
Abstract :
This paper describes a technique for enhancing the Mel-filtered log spectra of noisy speech, with application to noise-robust speech recognition. We first compute an SNR-based soft-decision mask in the Mel-spectral domain as an indicator of speech presence. We then exploit the known time-frequency correlation of speech by treating this mask as an image and applying median filtering and blurring to remove outliers and smooth the decision regions. The resulting mask is a set of multiplicative coefficients (in the range [0, 1]) that are used to discard the unreliable parts of the Mel-filtered log-spectrum of noisy speech. Finally, we apply Log-Spectral Flooring [1] to the liftered spectra of both clean and noisy speech so as to match their respective dynamic ranges and to emphasize the information in the spectral peaks. The noisy MFCCs computed from these modified log-spectra are more similar to their corresponding clean MFCCs. Evaluation on the Aurora-2 corpus shows that the proposed approach is competitive with state-of-the-art front-ends such as ETSI-AFE, MVA, and PNCC.
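The processing chain summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the sigmoid SNR-to-mask mapping, the 3x3 window, the fixed per-frame dynamic range, and every function and parameter name (soft_mask, smooth_mask, floor_log_spectrum, slope, midpoint_db, dynamic_range) are assumptions made for illustration. The sketch also assumes Mel energies are floored at 1.0 so that log-energies are non-negative, takes the noise estimate as given, and skips the liftering step.

```python
import numpy as np

EPS = 1e-10  # guards against division by zero and log of zero


def neighborhood(x, size=3):
    """Stack every size x size shifted view of x (edge-padded) on a new last axis."""
    r = size // 2
    padded = np.pad(x, r, mode="edge")
    rows, cols = x.shape
    return np.stack([padded[i:i + rows, j:j + cols]
                     for i in range(size) for j in range(size)], axis=-1)


def soft_mask(noisy_power, noise_power, slope=0.5, midpoint_db=0.0):
    """Map an estimated local SNR (dB) to [0, 1] with a sigmoid (assumed mapping)."""
    speech_power = np.maximum(noisy_power - noise_power, EPS)
    snr_db = 10.0 * np.log10(speech_power / np.maximum(noise_power, EPS))
    return 1.0 / (1.0 + np.exp(-slope * (snr_db - midpoint_db)))


def smooth_mask(mask, size=3):
    """Treat the mask as an image: median filter to drop outliers, then box blur."""
    median = np.median(neighborhood(mask, size), axis=-1)
    return neighborhood(median, size).mean(axis=-1)


def floor_log_spectrum(log_spec, dynamic_range=6.0):
    """Clamp each frame to within dynamic_range of its peak (assumed flooring rule)."""
    peak = log_spec.max(axis=0, keepdims=True)
    return np.maximum(log_spec, peak - dynamic_range)


def enhance(noisy_power, noise_power):
    """Return the smoothed mask and the masked, floored log-Mel spectrum."""
    mask = smooth_mask(soft_mask(noisy_power, noise_power))
    log_spec = np.log(np.maximum(noisy_power, 1.0))  # non-negative log-energies
    return mask, floor_log_spectrum(mask * log_spec)


# Synthetic example: 23 Mel bands x 40 frames, with a high-SNR "speech" patch.
rng = np.random.default_rng(0)
noise = np.full((23, 1), 100.0)
noisy = noise + rng.uniform(0.0, 50.0, size=(23, 40))
noisy[5:10, 10:30] += 1e4
mask, enhanced = enhance(noisy, noise)
```

In this toy input the mask is close to 1 inside the high-energy patch and small elsewhere, so the multiplication suppresses the low-SNR regions before flooring limits the per-frame dynamic range.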
Keywords :
estimation theory; masks; median filters; speech recognition; time-frequency analysis; Aurora-2 corpus; Mel-filtered log spectra; Mel-spectral domain; clean speech; liftered spectra; log-spectral enhancement; log-spectral flooring; median blurring; median filtering; multiplicative coefficients; noisy speech; robust speech recognition; soft-decision mask; soft-mask estimation; time-frequency correlation; Estimation; Hidden Markov models; Noise; Noise measurement; Speech; Speech enhancement; Speech recognition; Feature Extraction; Mask Estimation; Median Filtering; Speech Enhancement; Speech Recognition;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4673-0045-2
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2012.6288821