DocumentCode
178394
Title
Analysis-by-synthesis feature estimation for robust automatic speech recognition using spectral masks
Author
Mandel, Michael I. ; Narayanan, Arun
fYear
2014
fDate
4-9 May 2014
Firstpage
2509
Lastpage
2513
Abstract
Spectral masking is a promising method for noise suppression in which regions of the spectrogram that are dominated by noise are attenuated while regions dominated by speech are preserved. It is not clear, however, how best to combine spectral masking with the non-linear processing necessary to compute automatic speech recognition features. We propose an analysis-by-synthesis approach to automatic speech recognition, which, given a spectral mask, poses the estimation of mel frequency cepstral coefficients (MFCCs) of the clean speech as an optimization problem. MFCCs are found that minimize a combination of the distance from the resynthesized clean power spectrum to the regions of the noisy spectrum selected by the mask and the negative log likelihood under an unmodified large vocabulary continuous speech recognizer. In evaluations on the Aurora4 noisy speech recognition task with both ideal and estimated masks, analysis-by-synthesis decreases both word error rates and distances to clean speech as compared to traditional approaches.
Keywords
cepstral analysis; feature extraction; optimisation; speech recognition; vocabulary; Aurora4 noisy speech recognition task; MFCC; analysis-by-synthesis feature estimation; mel frequency cepstral coefficients; negative log likelihood; noise suppression; nonlinear processing; optimization problem; robust automatic speech recognition; spectral masking; spectrogram regions; unmodified large vocabulary; word error rates; Hidden Markov models; Lattices; Noise; Optimization; Speech; Speech processing; Speech recognition; analysis-by-synthesis; large vocabulary automatic speech recognition; missing data; time-frequency masking
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location
Florence
Type
conf
DOI
10.1109/ICASSP.2014.6854052
Filename
6854052