Title :
Distinct triphone modeling by reference model weighting
Author :
Dongpeng Chen ; Mak, Brian
Author_Institution :
Dept. of Comput. Sci. & Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
Abstract :
State tying effectively strikes a balance between detailed modeling and robust parameter estimation for hidden Markov models (HMMs) in automatic speech recognition. However, triphone HMMs that are tied to the same state are not distinguishable in that state. Recently we proposed the idea of distinct acoustic modeling, in which no states are tied. In our novel cluster-based eigentriphone modeling method, triphones (or states) are grouped into non-overlapping clusters, and an orthogonal eigenbasis is derived from each cluster using weighted PCA. All member triphones (or states) of a cluster are then projected as distinct points onto the space spanned by its eigenvectors. In this paper, we propose a new, simpler training method called reference model weighting (RMW), which removes the requirement of an orthogonal basis in eigentriphone modeling and directly uses a set of reference model vectors in a cluster as the basis. All member model vectors are then constrained to lie in the space spanned by these reference model vectors. The difference between eigentriphone modeling and reference model weighting is analogous to the difference between eigenvoice and reference speaker weighting in speaker adaptation. The new RMW method shows consistently better performance than both eigentriphone modeling and the baseline tied-state HMMs in WSJ0 word recognition and TIMIT phoneme recognition.
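The contrast between the two bases described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' training procedure: the abstract does not say how the weights are estimated, so ordinary least squares is used here as a stand-in, and all names (`rmw_fit`, `eigen_fit`, `counts`, the toy dimensions) are hypothetical. Each "model vector" is assumed to be a flat vector of model parameters.

```python
import numpy as np

def rmw_fit(member, references):
    """Express `member` as a weighted sum of reference model vectors.

    references: (K, D) array, one reference model vector per row.
    Least squares is an assumed stand-in for the paper's estimation.
    """
    weights, *_ = np.linalg.lstsq(references.T, member, rcond=None)
    return weights, references.T @ weights

def eigen_fit(member, cluster, counts, n_basis):
    """Project `member` onto an orthonormal basis from weighted PCA.

    cluster: (N, D) model vectors; counts: (N,) assumed sample weights.
    """
    w = counts / counts.sum()
    mean = w @ cluster                                # weighted mean vector
    centered = (cluster - mean) * np.sqrt(w)[:, None]
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_basis]                              # orthonormal rows
    coords = basis @ (member - mean)
    return basis, coords, mean + basis.T @ coords

# Toy cluster: 6 model vectors of dimension 8 (made-up data).
rng = np.random.default_rng(0)
cluster = rng.normal(size=(6, 8))
references = cluster[:3]                  # reference vectors used directly as basis
member = 0.5 * references[0] + 0.3 * references[1] + 0.2 * references[2]

weights, approx = rmw_fit(member, references)
basis, coords, eig_approx = eigen_fit(member, cluster, np.ones(6), n_basis=2)
```

In this sketch the RMW basis is simply a subset of the cluster's own vectors and need not be orthogonal, whereas the eigentriphone-style basis requires an eigendecomposition step first.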
Keywords :
eigenvalues and eigenfunctions; hidden Markov models; parameter estimation; principal component analysis; speech recognition; HMM; RMW; TIMIT phoneme recognition; WSJ0 word recognition; automatic speech recognition; cluster-based eigentriphone; distinct acoustic modeling; distinct triphone; eigenvectors; nonoverlapping clusters; orthogonal eigenbasis; reference model vectors; reference model weighting; reference speaker weighting; robust parameter estimation; simpler training method; speaker adaptation; weighted PCA; Acoustics; Adaptation models; Hidden Markov models; Speech; Speech recognition; Training; Vectors; acoustic modeling; eigentriphone; eigenvoice; regularization; state tying
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
DOI :
10.1109/ICASSP.2013.6639050