• DocumentCode
    3560595
  • Title
    Trust Region-Based Optimization for Maximum Mutual Information Estimation of HMMs in Speech Recognition
  • Author
    Cong Liu; Yu Hu; Li-Rong Dai; Hui Jiang
  • Author_Institution
    iFlytek Res., Hefei, China
  • Volume
    19
  • Issue
    8
  • fYear
    2011
  • Firstpage
    2474
  • Lastpage
    2485
  • Abstract
    In this paper, we propose two novel optimization methods for discriminative training (DT) of hidden Markov models (HMMs) in speech recognition. Both are based on an efficient global optimization algorithm for the so-called trust region (TR) problem, in which a quadratic function is minimized under a spherical constraint. In the first method, maximum mutual information estimation (MMIE) of Gaussian mixture HMMs is formulated as a standard TR problem, so that the efficient global optimization method can be used in each iteration to maximize the auxiliary function of discriminative training. In the second method, we construct a new auxiliary function for DT of HMMs by adding a quadratic penalty term. The new auxiliary function serves as a first-order approximation as well as a lower bound of the original discriminative objective function within a locality constraint. Because of the lower-bound property, the optimum found for the new auxiliary function is guaranteed to improve the original discriminative objective function until it converges to a local optimum or stationary point of that objective function. Both TR-based optimization methods have been evaluated on two standard large-vocabulary continuous speech recognition tasks, using the WSJ0 and Switchboard databases. Experimental results show that the proposed TR methods outperform the conventional extended Baum-Welch (EBW) method in terms of both convergence behavior and recognition performance.
  • Keywords
    Gaussian processes; approximation theory; hidden Markov models; quadratic programming; speech recognition; Gaussian mixture HMM; WSJ0 databases; auxiliary function; convergence behavior; discriminative objective function; discriminative training; first-order approximation; global optimization algorithm; hidden Markov models; lower-bound property; maximum mutual information estimation; objective function; quadratic penalty term; spherical constraint; standard large-vocabulary continuous speech recognition tasks; Switchboard databases; trust region-based optimization; Automatic speech recognition; Hidden Markov models; Mutual information; Optimization; Discriminative training; global optimization; lower-bounded auxiliary function; trust region problem
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • Conference_Location
    4/21/2011 12:00:00 AM
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2011.2144969
  • Filename
    5753922