DocumentCode :
2972054
Title :
Improving joint uncertainty decoding performance by predictive methods for noise robust speech recognition
Author :
Xu, Haitian ; Gales, Mark J F ; Chin, K.K.
Author_Institution :
Cambridge Res. Lab., Toshiba Res. Eur. Ltd., Cambridge, UK
fYear :
2009
fDate :
Nov. 13 2009-Dec. 17 2009
Firstpage :
222
Lastpage :
227
Abstract :
Model-based noise compensation techniques, such as vector Taylor series (VTS) compensation, have been applied to a range of noise robustness tasks. However, one issue with these approaches is that they are computationally expensive for large speech recognition systems. To address this problem, schemes such as Joint Uncertainty Decoding (JUD) have been proposed. Though computationally more efficient, these schemes typically degrade system performance. This paper proposes an alternative scheme, VTS-JUD, which is related to JUD but makes fewer approximations. Unfortunately, this approach also removes some of the computational advantages of JUD. To address this, rather than using VTS-JUD directly, it is used to obtain statistics for estimating a predictive linear transform, PCMLLR. This is both computationally efficient and limits some of the issues associated with the diagonal covariance matrices typically used with schemes such as VTS. PCMLLR can also be used simply within an adaptive training framework (PAT). The performance of the VTS-JUD, PCMLLR and PAT systems was compared with that of a number of standard approaches on an in-car speech recognition task. The proposed scheme is an attractive alternative to existing approaches.
Keywords :
covariance matrices; decoding; speech recognition; adaptive training framework; diagonal covariance matrices; in-car speech recognition task; joint uncertainty decoding performance; model-based noise compensation techniques; noise robust speech recognition; predictive linear transform; vector Taylor series compensation; Acoustic noise; Computational efficiency; Decoding; Degradation; Hidden Markov models; Noise robustness; Speech recognition; Taylor series; Uncertainty; Working environment noise;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
Conference_Location :
Merano
Print_ISBN :
978-1-4244-5478-5
Electronic_ISBN :
978-1-4244-5479-2
Type :
conf
DOI :
10.1109/ASRU.2009.5373317
Filename :
5373317