Regional Information Center for Science and Technology (RICeST) - Joint Uncertainty Decoding With Predictive Methods for Noise Robust Speech Recognition

DocumentCode :

1392583

Title :

Joint Uncertainty Decoding With Predictive Methods for Noise Robust Speech Recognition

Author :

Xu, Haitian ; Gales, Mark J F ; Chin, K.K.

Author_Institution :

Toshiba Res. Eur., Ltd., Cambridge, UK

Volume :

Issue :

Year :

2011

Firstpage :

1665

Lastpage :

1676

Abstract :

Model-based noise compensation techniques are a powerful approach to improve speech recognition performance in noisy environments. However, one of the major issues with these schemes is that they are computationally expensive. Though techniques have been proposed to address this problem, they often result in degradations in performance. This paper proposes a new, highly flexible, approach which allows the computational load required for noise compensation to be controlled while maintaining good performance. The scheme applies the improved joint uncertainty decoding with the predictive linear transform framework. The final compensation is implemented as a set of linear transforms of the features, decoupling the computational cost of compensation from the complexity of the recognition system acoustic models. Furthermore, by using linear transforms, changes in the correlations in the feature vector can also be efficiently modeled. The proposed methods can be easily applied in an adaptive training scheme, including discriminative adaptive training. The performance of the approach is compared to a number of standard schemes on Aurora 2 as well as in-car speech recognition tasks. Results indicate that the proposed scheme is an attractive alternative to existing approaches.

Keywords :

computational complexity; decoding; speech recognition; transforms; acoustic models; discriminative adaptive training; in-car speech recognition tasks; joint uncertainty decoding; model-based noise compensation techniques; noise compensation; noise robust speech recognition; predictive linear transform framework; Adaptation model; Computational modeling; Hidden Markov models; Joints; Noise; Speech; Transforms; Adaptive training; constrained maximum-likelihood linear regression (CMLLR); joint uncertainty decoding; vector Taylor series (VTS);

Language :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1558-7916

Type :

Journal article

DOI :

10.1109/TASL.2010.2096214

Filename :

5654579

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1392583