DocumentCode
1065215
Title
High-performance connected digit recognition using maximum mutual information estimation
Author
Normandin, Yves ; Cardin, Régis ; de Mori, Renato
Author_Institution
Centre de Recherche Inf., McGill Coll., Montreal, Que., Canada
Volume
2
Issue
2
fYear
1994
fDate
4/1/1994 12:00:00 AM
Firstpage
299
Lastpage
311
Abstract
Hidden Markov models (HMM's) are one of the most powerful speech recognition tools available today. Even so, the inadequacies of HMM's as a "correct" modeling framework for speech are well known. In this context, it is argued in this paper that the maximum mutual information estimation (MMIE) formulation for training is more appropriate than maximum likelihood estimation (MLE) for reducing the error rate. Corrective MMIE training is introduced. It is a very efficient new training algorithm which uses a modified version of a discrete reestimation formula recently proposed by Gopalakrishnan et al. (see IEEE Trans. Inform. Theory, Jan. 1991). Reestimation formulas are proposed for the case of diagonal Gaussian densities and their convergence properties are experimentally demonstrated. A description of how these formulas are integrated into our training algorithm is given. Using the MMIE framework for training, it is shown how weighting the contribution of different parameter sets in the computation of output probabilities introduces substantial recognition improvements. Using the TIDIGITS connected digit corpus, a large number of experiments are performed with the ideas, techniques, and algorithms presented in this paper. These experiments show that MMIE systematically provides substantial error rate reductions with respect to MLE alone and that, thanks to the new training techniques, these results can be obtained at an acceptable computational cost. The best results obtained in the experiments were 0.29% word error rate and 0.89% string error rate on the adult portion of the corpus.
Keywords
hidden Markov models; information theory; parameter estimation; speech recognition; stochastic processes; HMM; MMIE; TIDIGITS connected digit corpus; connected digit recognition; convergence properties; diagonal Gaussian densities; discrete reestimation formula; error rate reduction; maximum mutual information estimation; output probabilities; string error rate; training algorithm; word error rate; Automatic speech recognition; Convergence; Costs; Error analysis; Maximum likelihood decoding; Maximum likelihood estimation; Mutual information
fLanguage
English
Journal_Title
Speech and Audio Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1063-6676
Type
jour
DOI
10.1109/89.279279
Filename
279279