Title :
Trust Region-Based Optimization for Maximum Mutual Information Estimation of HMMs in Speech Recognition
Author :
Cong Liu ; Yu Hu ; Li-Rong Dai ; Hui Jiang
Author_Institution :
iFlytek Res., Hefei, China
Abstract :
In this paper, we propose two novel optimization methods for discriminative training (DT) of hidden Markov models (HMMs) in speech recognition, both based on an efficient global optimization algorithm for the so-called trust region (TR) problem, in which a quadratic function is minimized under a spherical constraint. In the first method, maximum mutual information estimation (MMIE) of Gaussian mixture HMMs is formulated as a standard TR problem, so that the efficient global optimization method can be used in each iteration to maximize the auxiliary function of discriminative training. In the second method, we construct a new auxiliary function for DT of HMMs by adding a quadratic penalty term. The new auxiliary function is constructed to serve as a first-order approximation as well as a lower bound of the original discriminative objective function within a locality constraint. Owing to the lower-bound property, the optimal point found for the new auxiliary function is guaranteed to improve the original discriminative objective function until it converges to a local optimum or stationary point of that function. Both TR-based optimization methods are evaluated on two standard large-vocabulary continuous speech recognition tasks, using the WSJ0 and Switchboard databases. Experimental results show that the proposed TR methods outperform the conventional extended Baum-Welch (EBW) method in terms of both convergence behavior and recognition performance.
Keywords :
Gaussian processes; approximation theory; hidden Markov models; quadratic programming; speech recognition; Gaussian mixture HMM; WSJ0 databases; auxiliary function; convergence behavior; discriminative objective function; discriminative training; first-order approximation; global optimization algorithm; hidden Markov models; lower-bound property; maximum mutual information estimation; objective function; quadratic penalty term; spherical constraint; standard large-vocabulary continuous speech recognition tasks; Switchboard databases; trust region-based optimization; Automatic speech recognition; Hidden Markov models; Mutual information; Optimization; Discriminative training; global optimization; lower-bounded auxiliary function; trust region problem;
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Date :
4/21/2011
DOI :
10.1109/TASL.2011.2144969