Regional Information Center for Science and Technology - GMM-HMM acoustic model training by a two level procedure with Gaussian components determined by automatic model selection

DocumentCode :

2791533

Title :

GMM-HMM acoustic model training by a two level procedure with Gaussian components determined by automatic model selection

Author :

Su, Dan ; Wu, Xihong ; Xu, Lei

Author_Institution :

Key Lab. of Machine Perception (Minist. of Educ.), Peking Univ., Beijing, China

fYear :

2010

fDate :

14-19 March 2010

Firstpage :

4890

Lastpage :

4893

Abstract :

This paper investigates Bayesian Ying-Yang (BYY) learning for speech recognition with hidden Markov models (HMMs) based on Gaussian mixture models (GMMs). A two-level procedure is proposed in which the hidden Markov level is still trained under the maximum likelihood principle by the Baum-Welch algorithm, while the GMM level is trained under the BYY best harmony principle. We propose a new batch-mode, EM-like Ying-Yang alternation algorithm and use it as a plug-in block within the Baum-Welch algorithm. The advantage is that the number of GMM components can be determined automatically during BYY harmony learning, and that the resulting model parameters are less affected than those of EM-ML training by overfitting and singular solutions. In comparison with standard EM-ML training and classical model selection criteria, including BIC and AIC, speech recognition experiments on a large-vocabulary task using the Hub4 broadcast news database show that the proposed algorithm provides improved performance and good convergence.
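
For readers unfamiliar with the baseline that the abstract compares against, the following is a minimal sketch of classical model selection: fit a GMM by EM under maximum likelihood for each candidate number of components, then score each fit with AIC and BIC. It is not the BYY harmony learning algorithm of the paper, and it is not taken from the paper. The 1-D data, the function names (`fit_gmm_1d`, `score_models`), the quantile-based initialisation, and the variance floor are all assumptions made for illustration.

```python
import math
import random


def log_joint(x, w, m, v):
    """Return log(w * N(x | m, v)) for one mixture component."""
    return math.log(w) - 0.5 * math.log(2.0 * math.pi * v) - (x - m) ** 2 / (2.0 * v)


def log_sum_exp(vals):
    """Numerically stable log(sum(exp(vals)))."""
    mx = max(vals)
    return mx + math.log(sum(math.exp(val - mx) for val in vals))


def log_likelihood(xs, weights, means, variances):
    """Total log-likelihood of the data under the mixture."""
    return sum(
        log_sum_exp([log_joint(x, w, m, v) for w, m, v in zip(weights, means, variances)])
        for x in xs
    )


def fit_gmm_1d(xs, k, n_iter=100, var_floor=1e-3):
    """Fit a k-component 1-D GMM by EM; return (weights, means, variances, loglik)."""
    n = len(xs)
    srt = sorted(xs)
    # Deterministic initialisation (an assumption): means at evenly spaced quantiles.
    means = [srt[int((j + 0.5) * n / k)] for j in range(k)]
    mu = sum(xs) / n
    gvar = max(sum((x - mu) ** 2 for x in xs) / n, var_floor)
    weights, variances = [1.0 / k] * k, [gvar] * k
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in xs:
            lj = [log_joint(x, w, m, v) for w, m, v in zip(weights, means, variances)]
            norm = log_sum_exp(lj)
            resp.append([math.exp(val - norm) for val in lj])
        # M-step: re-estimate parameters; the variance floor guards against
        # the singular solutions that plain EM-ML training can run into.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, var_floor)
            weights[j] = nj / n
    return weights, means, variances, log_likelihood(xs, weights, means, variances)


def n_free_params(k):
    """Free parameters of a 1-D GMM: (k - 1) weights + k means + k variances."""
    return 3 * k - 1


def score_models(xs, ks):
    """Return {k: {"loglik", "aic", "bic"}} for each candidate k (lower is better)."""
    n = len(xs)
    scores = {}
    for k in ks:
        ll = fit_gmm_1d(xs, k)[3]
        p = n_free_params(k)
        scores[k] = {"loglik": ll, "aic": -2.0 * ll + 2.0 * p, "bic": -2.0 * ll + p * math.log(n)}
    return scores


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic two-cluster data, purely illustrative.
    data = [rng.gauss(-5.0, 1.0) for _ in range(200)] + [rng.gauss(5.0, 1.0) for _ in range(200)]
    scores = score_models(data, [1, 2, 3, 4])
    print("best k by BIC:", min(scores, key=lambda k: scores[k]["bic"]))
    print("best k by AIC:", min(scores, key=lambda k: scores[k]["aic"]))
```

In this baseline, every candidate component count needs its own full EM run before the criteria can be compared. The abstract's stated advantage of BYY harmony learning is that the component count is determined during training itself.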

Keywords :

Bayes methods; acoustic signal processing; hidden Markov models; learning (artificial intelligence); speech recognition; Baum-Welch algorithm; Bayesian Ying-Yang learning; GMM-HMM acoustic model training; Gaussian mixture model; Hidden Markov model; Hub4 broadcast; automatic model selection; two level procedure; vocabulary task; Auditory system; Automatic speech recognition; Bayesian methods; Convergence; Databases; Laboratories; Maximum likelihood estimation; Vocabulary; GMMs; HMMs; model selection

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on

Conference_Location :

Dallas, TX

ISSN :

1520-6149

Print_ISBN :

978-1-4244-4295-9

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2010.5495122

Filename :

5495122

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2791533