Joint maximum a posteriori adaptation of transformation and HMM parameters

Author

Siohan, Olivier ; Chesta, Cristina ; Lee, Chin-Hui

Author_Institution

Lucent Technol. Bell Labs., Murray Hill, NJ, USA

Volume

9

Issue

4

fYear

2001

fDate

5/1/2001 12:00:00 AM

Firstpage

417

Lastpage

428

Abstract

Model adaptation techniques are an efficient way to reduce the mismatch that typically occurs between the training and test conditions of a speech recognizer. Adaptation techniques can usually be divided into two families of approaches. On one hand, direct model adaptation reestimates the model parameters themselves, for example using MAP adaptation. Since direct adaptation only reestimates the parameters of units that appear in the adaptation data, a large amount of such data is needed to observe any significant improvement in performance. However, good asymptotic properties are usually observed, meaning that the performance keeps improving as the amount of adaptation data increases. On the other hand, indirect model adaptation applies a general transformation to clusters of model parameters. Because every model in a cluster is transformed, including models of units not seen in the adaptation data, the approach is quite effective when only a small amount of adaptation data is available. However, as the amount of adaptation data increases, the performance improvement quickly saturates. We propose to jointly estimate model parameters and transformation parameters using a single estimation criterion based on Bayesian statistics. We show that by providing a prior distribution for the model parameters and the transformation parameters, it is possible to jointly estimate these two sets of parameters using maximum a posteriori (MAP) estimation. Experimental evaluation on nonnative speaker and channel adaptation illustrates the effectiveness of the proposed approach.
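
The contrast the abstract draws between direct and indirect adaptation can be illustrated with a toy sketch. This is not the estimation procedure of the paper: it assumes single Gaussian means per unit, a textbook MAP mean update with a hypothetical prior weight `tau`, and a single shared additive bias standing in for the "general transformation". The names `map_mean`, `shared_bias`, and the two units `"a"` and `"b"` are invented for illustration.

```python
import numpy as np

def map_mean(prior_mean, frames, tau):
    """Direct adaptation (assumed form): MAP update of one Gaussian mean.

    The prior mean acts like `tau` pseudo-observations, so with no frames
    the mean stays at the prior, and with many frames it approaches the
    sample mean.
    """
    prior_mean = np.asarray(prior_mean, dtype=float)
    frames = np.asarray(frames, dtype=float).reshape(-1, prior_mean.size)
    n = frames.shape[0]
    return (tau * prior_mean + frames.sum(axis=0)) / (tau + n)

def shared_bias(means, frames_by_unit):
    """Indirect adaptation (assumed form): one bias shared by all units.

    The bias is the average offset between each adaptation frame and the
    mean of the unit it is assigned to.
    """
    offsets = [
        np.asarray(frames, dtype=float) - np.asarray(means[unit], dtype=float)
        for unit, frames in frames_by_unit.items()
        if len(frames) > 0
    ]
    return np.concatenate(offsets, axis=0).mean(axis=0)

# Two units; only unit "a" is observed in the adaptation data.
means = {"a": np.array([0.0, 0.0]), "b": np.array([5.0, 5.0])}
adaptation = {"a": [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2]], "b": []}

# Direct: each unit is updated only from its own frames.
direct = {u: map_mean(m, adaptation[u], tau=3.0) for u, m in means.items()}

# Indirect: every unit is shifted by the same estimated bias.
bias = shared_bias(means, adaptation)
indirect = {u: m + bias for u, m in means.items()}

print("direct  :", direct)
print("indirect:", indirect)
```

In this toy setup the direct update moves unit "a" halfway toward its data (to about [0.5, 0.5]) and leaves the unseen unit "b" untouched, while the shared bias moves both units by about [1, 1]. That mirrors the trade-off described above: the transformation helps unseen units with little data, whereas the per-unit update continues to improve only as more data for each unit arrives.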

Keywords

Bayes methods; adaptive estimation; hidden Markov models; parameter estimation; speech recognition; Bayesian statistics; HMM parameters; adaptation data; asymptotic properties; automatic speech recognition; channel adaptation; direct model adaptation; indirect model adaptation; joint MAP adaptation; joint maximum a posteriori adaptation; maximum a posteriori estimation; model parameters reestimation; nonnative speaker adaptation; performance; speech recognizer; test condition; training condition; transformation parameters estimation; Acoustic testing; Adaptation model; Automatic speech recognition; Hidden Markov models; Laboratories; Loudspeakers; Maximum likelihood estimation; Parameter estimation; Robustness; Speech recognition

fLanguage

English

Journal_Title

Speech and Audio Processing, IEEE Transactions on

Publisher

IEEE

ISSN

1063-6676

Type

jour

DOI

10.1109/89.917687

Filename

917687