Regional Information Center for Science and Technology - Improving joint uncertainty decoding performance by predictive methods for noise robust speech recognition

DocumentCode :

2972054

Title :

Improving joint uncertainty decoding performance by predictive methods for noise robust speech recognition

Author :

Xu, Haitian ; Gales, Mark J F ; Chin, K.K.

Author_Institution :

Cambridge Res. Lab., Toshiba Res. Eur. Ltd., Cambridge, UK

fYear :

2009

fDate :

Nov. 13 2009-Dec. 17 2009

Firstpage :

222

Lastpage :

227

Abstract :

Model-based noise compensation techniques, such as vector Taylor series (VTS) compensation, have been applied to a range of noise robustness tasks. However, one issue with these approaches is that they are computationally expensive for large speech recognition systems. To address this problem, schemes such as Joint Uncertainty Decoding (JUD) have been proposed. Though computationally more efficient, these schemes typically degrade system performance. This paper proposes an alternative scheme, VTS-JUD, which is related to JUD but makes fewer approximations. Unfortunately, this approach also removes some of the computational advantages of JUD. To address this, VTS-JUD is not used directly but instead provides the statistics for estimating a predictive linear transform, PCMLLR. This is both computationally efficient and limits some of the issues associated with the diagonal covariance matrices typically used with schemes such as VTS. PCMLLR can also be used simply within an adaptive training framework (PAT). The performance of the VTS-JUD, PCMLLR and PAT systems was compared with a number of standard approaches on an in-car speech recognition task. The proposed scheme is an attractive alternative to existing approaches.

Keywords :

covariance matrices; decoding; speech recognition; adaptive training framework; diagonal covariance matrices; in-car speech recognition task; joint uncertainty decoding performance; model-based noise compensation techniques; noise robust speech recognition; predictive linear transform; vector Taylor series compensation; Acoustic noise; Computational efficiency; Decoding; Degradation; Hidden Markov models; Noise robustness; Speech recognition; Taylor series; Uncertainty; Working environment noise;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on

Conference_Location :

Merano

Print_ISBN :

978-1-4244-5478-5

Electronic_ISBN :

978-1-4244-5479-2

Type :

conf

DOI :

10.1109/ASRU.2009.5373317

Filename :

5373317

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2972054