DocumentCode
417292
Title
Noise robust speech recognition with a switching linear dynamic model
Author
Droppo, Jasha ; Acero, Alex
Author_Institution
Microsoft Res., Redmond, WA, USA
Volume
1
fYear
2004
fDate
17-21 May 2004
Abstract
Model-based feature enhancement techniques are constructed from acoustic models for speech and noise, together with a model of how the speech and noise produce the noisy observations. Most techniques incorporate either Gaussian mixture models (GMMs) or hidden Markov models (HMMs). This paper explores using a switching linear dynamic model (LDM) for the clean speech. The linear dynamics of the model capture the smooth time evolution of speech. The switching states of the model capture the piecewise stationary characteristics of speech. However, incorporating a switching LDM makes the enhancement problem intractable. With a GMM or an HMM, the enhancement running time is proportional to the length of the utterance. With a switching LDM, the running time becomes exponential in the length of the utterance. To overcome this drawback, the standard generalized pseudo-Bayesian technique is used to provide an approximate solution of the enhancement problem. We present preliminary results demonstrating that, even with relatively small model sizes, substantial word error rate improvement can be achieved.
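The exponential cost and its approximate remedy can be illustrated with a small sketch. With M switching states, the exact posterior over the clean speech after T frames is a mixture of M^T Gaussians, one per state sequence. A generalized pseudo-Bayesian filter of order 1 (GPB1) avoids this by collapsing the M per-state posteriors into a single moment-matched Gaussian after every frame, so the cost grows linearly in T. The code below is a minimal scalar illustration, not the paper's implementation: the function names, the linear observation model `y = x + n`, and the fixed per-frame state priors are all assumptions made here for brevity.

```python
import math

def collapse(weights, means, variances):
    """Moment-match a 1-D Gaussian mixture to a single Gaussian."""
    mean = sum(w * m for w, m in zip(weights, means))
    var = sum(w * (v + (m - mean) ** 2)
              for w, m, v in zip(weights, means, variances))
    return mean, var

def gpb1_step(mean, var, y, states, obs_var):
    """One GPB1 step for a scalar switching LDM (illustrative assumptions).

    states: list of (prior, a, b, q), meaning that under this state
            x_t = a * x_{t-1} + b + v,  v ~ N(0, q).
    Assumed observation model: y_t = x_t + n,  n ~ N(0, obs_var).
    """
    weights, means, variances = [], [], []
    for prior, a, b, q in states:
        # Predict the clean-speech value under this state's dynamics.
        pred_mean = a * mean + b
        pred_var = a * a * var + q
        # Likelihood of the observation under this state.
        s = pred_var + obs_var
        lik = math.exp(-0.5 * (y - pred_mean) ** 2 / s) / math.sqrt(2 * math.pi * s)
        # Kalman measurement update for this state.
        gain = pred_var / s
        means.append(pred_mean + gain * (y - pred_mean))
        variances.append((1.0 - gain) * pred_var)
        weights.append(prior * lik)
    total = sum(weights)
    weights = [w / total for w in weights]
    # Collapse the M state-conditional posteriors into one Gaussian.
    return collapse(weights, means, variances)

def gpb1_filter(ys, states, obs_var, mean0=0.0, var0=1.0):
    """Run GPB1 over an observation sequence; returns the posterior means."""
    mean, var = mean0, var0
    estimates = []
    for y in ys:
        mean, var = gpb1_step(mean, var, y, states, obs_var)
        estimates.append(mean)
    return estimates

if __name__ == "__main__":
    # Two hypothetical states: a slowly varying one and a faster-moving one.
    states = [(0.5, 0.95, 0.0, 0.01), (0.5, 0.5, 1.0, 0.5)]
    print(gpb1_filter([0.1, 0.3, 1.2, 1.9, 2.0], states, obs_var=0.2))
```

Each call to `gpb1_step` does work proportional to M, so a T-frame utterance costs on the order of M·T operations, in contrast with the M^T components an exact solution would have to track.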
Keywords
Bayes methods; error statistics; feature extraction; speech enhancement; speech recognition; acoustic models; approximate solution; automatic speech recognition systems; exponential running time; generalized pseudo-Bayesian technique; model based feature enhancement; noise robust speech recognition; piecewise stationary characteristics; smooth time evolution; switching linear dynamic model; word error rate improvement; Acoustic noise; Additive noise; Automatic speech recognition; Degradation; Error analysis; Hidden Markov models; Noise robustness; Speech processing
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN
1520-6149
Print_ISBN
0-7803-8484-9
Type
conf
DOI
10.1109/ICASSP.2004.1326145
Filename
1326145