DocumentCode :
2791533
Title :
GMM-HMM acoustic model training by a two level procedure with Gaussian components determined by automatic model selection
Author :
Su, Dan ; Wu, Xihong ; Xu, Lei
Author_Institution :
Key Lab. of Machine Perception (Minist. of Educ.), Peking Univ., Beijing, China
fYear :
2010
fDate :
14-19 March 2010
Firstpage :
4890
Lastpage :
4893
Abstract :
This paper investigates Bayesian Ying-Yang (BYY) learning for speech recognition with hidden Markov models (HMMs) based on Gaussian mixture models (GMMs). A two-level procedure is proposed in which the hidden Markov level is still trained under the maximum likelihood principle by the Baum-Welch algorithm, while the GMM level is trained under the BYY best harmony principle. We propose a new batch-mode, EM-like Ying-Yang alternation algorithm and use it as a plug-in block in the Baum-Welch algorithm. The advantage is that the number of GMM components can be determined automatically during BYY harmony learning, and that the resulting model parameters are less affected by overfitting and singular solutions than those obtained by EM-ML training. In large-vocabulary speech recognition experiments on the Hub4 broadcast news database, comparisons with standard EM-ML training and with classical model selection criteria, including BIC and AIC, show that the proposed algorithm provides improved performance and good convergence.
Keywords :
Bayes methods; acoustic signal processing; hidden Markov models; learning (artificial intelligence); speech recognition; Baum-Welch algorithm; Bayesian Ying-Yang learning; GMM-HMM acoustic model training; Gaussian mixture model; Hidden Markov model; Hub4 broadcast; automatic model selection; two level procedure; vocabulary task; Auditory system; Automatic speech recognition; Bayesian methods; Convergence; Databases; Laboratories; Maximum likelihood estimation; Vocabulary; GMMs; HMMs; model selection;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
ISSN :
1520-6149
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2010.5495122
Filename :
5495122