DocumentCode :
1392583
Title :
Joint Uncertainty Decoding With Predictive Methods for Noise Robust Speech Recognition
Author :
Xu, Haitian ; Gales, Mark J F ; Chin, K.K.
Author_Institution :
Toshiba Res. Eur., Ltd., Cambridge, UK
Volume :
19
Issue :
6
fYear :
2011
Firstpage :
1665
Lastpage :
1676
Abstract :
Model-based noise compensation techniques are a powerful approach to improving speech recognition performance in noisy environments. However, one of the major issues with these schemes is that they are computationally expensive. Though techniques have been proposed to address this problem, they often degrade performance. This paper proposes a new, highly flexible approach that allows the computational load required for noise compensation to be controlled while maintaining good performance. The scheme combines improved joint uncertainty decoding with the predictive linear transform framework. The final compensation is implemented as a set of linear transforms of the features, decoupling the computational cost of compensation from the complexity of the recognition system's acoustic models. Furthermore, by using linear transforms, changes in the correlations in the feature vector can also be modeled efficiently. The proposed methods can be easily applied in an adaptive training scheme, including discriminative adaptive training. The performance of the approach is compared to a number of standard schemes on Aurora 2 as well as on in-car speech recognition tasks. Results indicate that the proposed scheme is an attractive alternative to existing approaches.
Keywords :
computational complexity; decoding; speech recognition; transforms; acoustic models; discriminative adaptive training; in-car speech recognition tasks; joint uncertainty decoding; model-based noise compensation techniques; noise compensation; noise robust speech recognition; predictive linear transform framework; Adaptation model; Computational modeling; Hidden Markov models; Joints; Noise; Speech; Transforms; Adaptive training; constrained maximum-likelihood linear regression (CMLLR); vector Taylor series (VTS)
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2010.2096214
Filename :
5654579