DocumentCode
4211
Title
Estimating Uncertainty to Improve Exemplar-Based Feature Enhancement for Noise Robust Speech Recognition
Author
Kallasjoki, Heikki ; Gemmeke, Jort F. ; Palomaki, Kalle J.
Author_Institution
Dept. of Signal Process. & Acoust., Aalto Univ., Aalto, Finland
Volume
22
Issue
2
fYear
2014
fDate
Feb. 2014
Firstpage
368
Lastpage
380
Abstract
We present a method of improving automatic speech recognition performance under noisy conditions by using a source separation approach to extract the underlying clean speech signal. The feature enhancement processing is complemented with heuristic estimates of the uncertainty of the source separation, which are used to further assist the recognition. The uncertainty heuristics are converted to estimates of variance for the extracted clean speech using a Gaussian mixture model based mapping, and applied in the decoding stage under the observation uncertainty framework. We propose six heuristics and evaluate them using both artificial and real-world noisy data, with acoustic models trained on clean speech, a multi-condition noisy data set, and the multi-condition set processed with the source separation front-end. Taking the uncertainty of the enhanced features into account is shown to improve recognition performance when the acoustic models are trained on unenhanced data, while training on enhanced noisy data yields the lowest error rates.
Keywords
Gaussian processes; mixture models; source separation; speech recognition; Gaussian mixture model; automatic speech recognition performance; exemplar based feature enhancement; extracted clean speech; feature enhancement processing; heuristic estimates; multicondition noisy data set; noise robust speech recognition; observation uncertainty framework; source separation approach; source separation front end; underlying clean speech signal; Acoustics; Noise; Noise measurement; Speech; Speech recognition; Uncertainty; Vectors; Exemplar-based; noise robustness; observation uncertainty; speech recognition; uncertainty estimation
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
Publisher
IEEE
ISSN
2329-9290
Type
jour
DOI
10.1109/TASLP.2013.2292328
Filename
6677576