DocumentCode :
918762
Title :
Filterbank-energy estimation using mixture and Markov models for recognition of noisy speech
Author :
Erell, Adoram ; Weintraub, Mitchel
Author_Institution :
SRI International, Menlo Park, CA, USA
Volume :
1
Issue :
1
fYear :
1993
fDate :
1/1/1993
Firstpage :
68
Lastpage :
76
Abstract :
An estimation algorithm for noise-robust speech recognition, the minimum mean log spectral distance (MMLSD), is presented. The estimation is matched to the recognizer by seeking to minimize the average distortion as measured by a Euclidean distance between filterbank log-energy vectors, which approximates the weighted-cepstral distance used by the recognizer. The estimate is computed using a clean-speech spectral probability distribution, estimated from a database, and a stationary ARMA model for the noise. When trained on clean speech and tested with additive white noise at 10-dB SNR, the recognition accuracy with the MMLSD algorithm is comparable to that achieved by training the recognizer at the same constant 10-dB SNR. The algorithm is also highly efficient with quasi-stationary environmental noise recorded with a desktop microphone, and requires almost no tuning to differences between this noise and the computer-generated white noise.
Keywords :
Markov processes; filtering and prediction theory; speech recognition; white noise; 10 dB; ARMA model; Euclidean distance; MMLSD algorithm; Markov models; SNR; additive white noise; average distortion; clean speech; computer-generated white noise; database; desktop microphone; estimation algorithm; filterbank energy estimation; filterbank log-energy vectors; minimum mean log spectral distance algorithm; noise robust speech recognition; noisy speech; quasistationary environmental noise; recognition accuracy; speech spectral probability distribution; weighted-cepstral distance; Distortion measurement; Distributed computing; Filter bank; Noise robustness; Probability distribution; Signal to noise ratio; Speech enhancement; Speech recognition; Working environment noise
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/89.221385
Filename :
221385