DocumentCode :
41907
Title :
Discriminatively Trained Sparse Inverse Covariance Matrices for Speech Recognition
Author :
Zhang, Weibin ; Fung, Pascale
Author_Institution :
Dept. of Electron. & Comput. Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
Volume :
22
Issue :
5
fYear :
2014
fDate :
May 2014
Firstpage :
873
Lastpage :
882
Abstract :
We propose to use acoustic models with sparse inverse covariance matrices to deal with the well-known over-fitting problem of discriminative training, especially when training data are limited. Under maximum likelihood training, sparse inverse covariance matrices have already yielded significant improvements over traditional diagonal and full covariance models. In state-of-the-art large vocabulary continuous speech recognition systems, discriminative training is commonly employed to achieve the best system performance. This paper investigates training acoustic models with sparse inverse covariance matrices using one of the most widely used discriminative training criteria, maximum mutual information (MMI). A lasso regularization term is added to the traditional MMI objective function to automatically sparsify the inverse covariance matrices. The whole training process is then derived by maximizing the new objective function, which is achieved by iteratively maximizing a weak-sense auxiliary function. The resulting problem is shown to be a convex optimization problem that can be solved efficiently. Experimental results on the published Wall Street Journal data set and our own collected Mandarin data set show that the acoustic models with sparse inverse covariance matrices consistently outperform the conventional diagonal and full covariance models.
Keywords :
convex programming; covariance matrices; iterative methods; maximum likelihood estimation; speech recognition; LASSO regularization term; MMI; Mandarin data sets; convex optimization problem; diagonal covariance models; discriminative training; discriminative training criteria-maximum mutual information; discriminatively trained sparse inverse covariance matrices; full covariance models; large vocabulary continuous speech recognition systems; maximum likelihood training; objective function; overfitting problem; training acoustic models; training data; weak-sense auxiliary function; Acoustics; Covariance matrices; Hidden Markov models; Linear programming; Mathematical model; Speech recognition; Training; Discriminative training; sparse inverse covariance matrix; speech recognition;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
2329-9290
Type :
jour
DOI :
10.1109/TASLP.2014.2312548
Filename :
6775248