Title :
Discriminatively Trained Sparse Inverse Covariance Matrices for Speech Recognition
Author :
Weibin Zhang ; Pascale Fung
Author_Institution :
Dept. of Electron. & Comput. Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
Abstract :
We propose using acoustic models with sparse inverse covariance matrices to address the well-known over-fitting problem of discriminative training, especially when training data are limited. With maximum likelihood training, sparse inverse covariance matrices have yielded significant improvements over traditional diagonal and full covariance models. In state-of-the-art large vocabulary continuous speech recognition systems, however, discriminative training is commonly employed to achieve the best system performance. This paper therefore investigates training acoustic models with sparse inverse covariance matrices using one of the most widely used discriminative training criteria, maximum mutual information (MMI). A lasso regularization term is added to the traditional MMI objective function to automatically sparsify the inverse covariance matrices. The training procedure is then derived by maximizing the new objective function, which is achieved by iteratively maximizing a weak-sense auxiliary function. The resulting problem is shown to be a convex optimization problem that can be solved efficiently. Experimental results on the published Wall Street Journal data set and on our own collected Mandarin data set show that acoustic models with sparse inverse covariance matrices consistently outperform the conventional diagonal and full covariance models.
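To illustrate how a lasso term acts on an inverse covariance (precision) matrix, the sketch below evaluates an L1-penalized Gaussian log-likelihood for a single Gaussian. It is only the maximum likelihood analogue of the idea, not the MMI objective or the weak-sense auxiliary function derived in the paper. The function names, the choice to penalize only off-diagonal entries, and the weight `rho` are assumptions made for this example.

```python
import math


def log_det_spd(P):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(P)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = P[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    # det(P) = prod(L[i][i])^2, so log det(P) = 2 * sum(log L[i][i])
    return 2.0 * sum(math.log(L[i][i]) for i in range(n))


def penalized_objective(P, S, rho):
    """log det(P) - tr(S P) - rho * sum_{i != j} |P[i][j]|.

    P   : candidate precision (inverse covariance) matrix
    S   : sample covariance matrix
    rho : lasso weight (hypothetical parameter name); larger values
          favor more zero off-diagonal entries in P
    Penalizing only off-diagonal entries is an assumption of this sketch.
    """
    n = len(P)
    trace_SP = sum(S[i][j] * P[j][i] for i in range(n) for j in range(n))
    l1_offdiag = sum(abs(P[i][j]) for i in range(n) for j in range(n) if i != j)
    return log_det_spd(P) - trace_SP - rho * l1_offdiag


identity = [[1.0, 0.0], [0.0, 1.0]]
dense = [[2.0, 1.0], [1.0, 2.0]]
sparse_value = penalized_objective(identity, identity, rho=0.5)
dense_value = penalized_objective(dense, identity, rho=0.5)
```

With an identity sample covariance, the diagonal (sparse) precision matrix scores higher than the dense one, which is the behavior the penalty is meant to encourage when the data do not support off-diagonal structure.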
Keywords :
convex programming; covariance matrices; iterative methods; maximum likelihood estimation; speech recognition; LASSO regularization term; MMI; Mandarin data sets; convex optimization problem; diagonal covariance models; discriminative training; discriminative training criteria; maximum mutual information; discriminatively trained sparse inverse covariance matrices; full covariance models; large vocabulary continuous speech recognition systems; maximum likelihood training; objective function; overfitting problem; training acoustic models; training data; weak-sense auxiliary function; Acoustics; Covariance matrices; Hidden Markov models; Linear programming; Mathematical model; Speech recognition; Training; Discriminative training; sparse inverse covariance matrix; speech recognition
Journal_Title :
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
DOI :
10.1109/TASLP.2014.2312548