• DocumentCode
    41907
  • Title
    Discriminatively Trained Sparse Inverse Covariance Matrices for Speech Recognition
  • Author
    Zhang, Weibin; Fung, Pascale
  • Author_Institution
    Dept. of Electron. & Comput. Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
  • Volume
    22
  • Issue
    5
  • fYear
    2014
  • fDate
    May 2014
  • Firstpage
    873
  • Lastpage
    882
  • Abstract
    We propose to use acoustic models with sparse inverse covariance matrices to deal with the well-known over-fitting problem of discriminative training, especially when training data are limited. Compared with traditional diagonal or full covariance models, sparse inverse covariance matrices have already achieved significant improvements under maximum likelihood training. In state-of-the-art large vocabulary continuous speech recognition systems, discriminative training is commonly employed to achieve the best system performance. This paper investigates training acoustic models with sparse inverse covariance matrices using one of the most widely used discriminative training criteria, maximum mutual information (MMI). A lasso regularization term is added to the traditional MMI objective function to automatically sparsify the inverse covariance matrices. The whole training process is then derived by maximizing the new objective function, which is achieved by iteratively maximizing a weak-sense auxiliary function. The final problem is shown to be a convex optimization problem that can be solved efficiently. Experimental results on the published Wall Street Journal and our collected Mandarin data sets show that acoustic models with sparse inverse covariance matrices consistently outperform the conventional diagonal and full covariance models.
  • Keywords
    convex programming; covariance matrices; iterative methods; maximum likelihood estimation; speech recognition; LASSO regularization term; MMI; Mandarin data sets; convex optimization problem; diagonal covariance models; discriminative training; discriminative training criteria; maximum mutual information; discriminatively trained sparse inverse covariance matrices; full covariance models; large vocabulary continuous speech recognition systems; maximum likelihood training; objective function; overfitting problem; training acoustic models; training data; weak-sense auxiliary function; Acoustics; Covariance matrices; Hidden Markov models; Linear programming; Mathematical model; Speech recognition; Training; Discriminative training; sparse inverse covariance matrix; speech recognition
  • fLanguage
    English
  • Journal_Title
    IEEE/ACM Transactions on Audio, Speech, and Language Processing
  • Publisher
    ieee
  • ISSN
    2329-9290
  • Type
    jour
  • DOI
    10.1109/TASLP.2014.2312548
  • Filename
    6775248
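
  Note: a hedged sketch of the objective described in the abstract above, written in our own notation (Lambda for the HMM parameters, O_r and w_r for the r-th training utterance and its reference transcription, M_w for the composite model of word sequence w, P_{jm} for the precision, i.e. inverse covariance, matrix of Gaussian m in state j, and alpha for the lasso penalty weight); the exact form and penalty placement used in the paper may differ:

    % Lasso-regularized MMI objective (sketch, assumed notation)
    \mathcal{F}(\Lambda) =
        \sum_{r} \log
          \frac{p_{\Lambda}(O_r \mid \mathcal{M}_{w_r})\, P(w_r)}
               {\sum_{w} p_{\Lambda}(O_r \mid \mathcal{M}_{w})\, P(w)}
        \;-\; \alpha \sum_{j,m} \sum_{i \neq k} \bigl| P_{jm}(i,k) \bigr|

  The first term is the standard MMI criterion; the lasso term drives off-diagonal precision entries toward exactly zero, which is what yields sparse inverse covariance matrices while the MMI term keeps the models discriminative.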