Discriminative adaptive training with VTS and JUD

Author

Flego, F. ; Gales, M.J.F.

Author_Institution

Eng. Dept., Cambridge Univ., Cambridge, UK

fYear

2009

fDate

Nov. 13 2009-Dec. 17 2009

Firstpage

170

Lastpage

175

Abstract

Adaptive training is a powerful approach for building speech recognition systems on non-homogeneous training data. Recently, approaches based on predictive model-based compensation schemes, such as joint uncertainty decoding (JUD) and vector Taylor series (VTS), have been proposed. This paper reviews these model-based compensation schemes and relates them to factor-analysis style systems. Forms of maximum likelihood (ML) adaptive training with these approaches are described, based on both second-order optimisation schemes and expectation maximisation (EM). However, discriminative training is used in many state-of-the-art speech recognition systems. Hence, this paper proposes discriminative adaptive training with predictive model-compensation approaches for noise-robust speech recognition. This training approach is applied to both JUD and VTS compensation with minimum phone error training. A large-scale multi-environment training configuration is used, and the systems are evaluated on a range of in-car collected data tasks.

Keywords

expectation-maximisation algorithm; optimisation; speech coding; speech recognition; JUD compensation; VTS compensation; discriminative adaptive training; expectation maximisation; factor-analysis style systems; joint uncertainty decoding; large scale multienvironment training configuration; maximum likelihood adaptive training; minimum phone error training; nonhomogeneous training data; predictive model-based compensation schemes; second-order optimisation schemes; speech recognition systems; vector Taylor series; Acoustic noise; Background noise; Maximum likelihood estimation; Maximum likelihood linear regression; Parameter estimation; Power system modeling; Predictive models; Speech recognition; Training data; Uncertainty;

fLanguage

English

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on

Conference_Location

Merano

Print_ISBN

978-1-4244-5478-5

Electronic_ISBN

978-1-4244-5479-2

Type

conf

DOI

10.1109/ASRU.2009.5373266

Filename

5373266