Discriminative adaptive training using the MPE criterion

Author

Wang, L. ; Woodland, P.C.

Author_Institution

Machine Intelligence Lab., Cambridge Univ., UK

fYear

2003

fDate

30 Nov.-3 Dec. 2003

Firstpage

279

Lastpage

284

Abstract

The paper addresses the use of discriminative training criteria for speaker adaptive training (SAT), where both the transforms and the model parameters are estimated using the minimum phone error (MPE) criterion. In a similar fashion to the use of I-smoothing for standard MPE training, a smoothing technique is introduced to avoid over-training when optimizing MPE-based feature-space transforms. Experiments on a conversational telephone speech (CTS) transcription task demonstrate that, after lattice-based MLLR adaptation, MPE-based SAT models can reduce the word error rate by 1.0% absolute compared with non-SAT MPE models. A simplified implementation of MPE-SAT that uses constrained MLLR in place of MPE-estimated transforms is also discussed.

Keywords

error statistics; learning (artificial intelligence); natural languages; optimisation; parameter estimation; smoothing methods; speech recognition; MLLR adaptation; MPE criterion; conversational telephone speech transcription; discriminative training criteria; feature-space transform optimization; minimum phone error criterion; model parameter estimation; smoothing technique; speaker adaptive training; speech recognition; transform generation estimation; word error rate; Hidden Markov models; Loudspeakers; Machine intelligence; Maximum likelihood estimation; Maximum likelihood linear regression; Parameter estimation; Smoothing methods; Speech; Statistics; Telephony;

fLanguage

English

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on

Print_ISBN

0-7803-7980-2

Type

conf

DOI

10.1109/ASRU.2003.1318454

Filename

1318454