DocumentCode
3424072
Title
Modeling the dynamics of speech and noise for speech feature enhancement in ASR
Author
Windmann, Stefan ; Haeb-Umbach, Reinhold
Author_Institution
Dept. of Commun. Eng., Paderborn Univ., Paderborn
fYear
2008
fDate
March 31 2008-April 4 2008
Firstpage
4409
Lastpage
4412
Abstract
In this paper, a switching linear dynamical model (SLDM) approach to speech feature enhancement is improved by employing more accurate models of the dynamics of speech and noise. The model of the clean speech feature trajectory is improved by augmenting the state vector to capture information derived from the delta features. Further, a hidden noise state variable is introduced to obtain a more elaborate model of the noise dynamics. Approximate Bayesian inference in the SLDM is carried out by a bank of extended Kalman filters, whose outputs are combined according to the a posteriori probabilities of the individual state models. Experimental results on the AURORA2 database show improved recognition accuracy.
Keywords
Bayes methods; Kalman filters; channel bank filters; speech enhancement; AURORA2 database; Bayesian inference; a posteriori probability; extended Kalman filter banks; hidden noise state variable; noise dynamics; speech feature enhancement; speech feature trajectory; switching linear dynamical model approach; Automatic speech recognition; Background noise; Cepstral analysis; Communication switching; Jacobian matrices; Speech recognition; State estimation; Switches; Vectors; ASR; SLDM; inter-frame correlation
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location
Las Vegas, NV
ISSN
1520-6149
Print_ISBN
978-1-4244-1483-3
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2008.4518633
Filename
4518633