Title :
Feature Enhancement for Noisy Speech Recognition With a Time-Variant Linear Predictive HMM Structure
Author :
Deng, Jianping ; Bouchard, Martin ; Yeap, Tet Hin
Author_Institution :
Sch. of Inf. Technol. & Eng., Univ. of Ottawa, Ottawa, ON
fDate :
7/1/2008
Abstract :
This paper presents a new approach to speech feature enhancement in the log-spectral domain for noisy speech recognition. A switching linear dynamic model (SLDM) is explored as a parametric model for the clean speech distribution. Each multivariate linear dynamic model (LDM) is associated with a hidden state of a hidden Markov model (HMM) in an attempt to describe the temporal correlations among adjacent frames of speech features. A state transition on the Markov chain corresponds to activating a different LDM, or to activating several of them simultaneously with different probabilities generated by the HMM. Rather than holding the transition probabilities fixed for the whole process, a connectionist model is employed to learn time-variant transition probabilities. With the resulting SLDM as the speech model and with a model for the noise, speech and noise are jointly tracked by means of switching Kalman filtering. Comprehensive experiments are carried out on the Aurora2 database to evaluate the new algorithm. The results show that the new SLDM approach can further improve speech feature enhancement performance in terms of noise-robust recognition accuracy, since the transition probabilities among the LDMs can be described more precisely at each time point.
Keywords :
Kalman filters; hidden Markov models; speech enhancement; speech recognition; Aurora2 database; Markov chain; clean speech distribution; hidden Markov model; log-spectral domain; multivariate linear dynamic model; noisy speech recognition; speech feature enhancement; switching Kalman filtering; switching linear dynamic model; temporal correlations; time variant transition probabilities; transition probabilities; switching linear dynamic models (SLDMs); time-variant linear predictive hidden Markov model (HMM)
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2004.924593