DocumentCode :
69529
Title :
Eigentriphones for Context-Dependent Acoustic Modeling
Author :
Ko, Tae Kuk ; Mak, Brian
Author_Institution :
Dept. of Comput. Sci. & Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
Volume :
21
Issue :
6
fYear :
2013
fDate :
Jun-13
Firstpage :
1285
Lastpage :
1294
Abstract :
Most automatic speech recognizers employ tied-state triphone hidden Markov models (HMMs), in which the corresponding triphone states of the same base phone are tied. State tying is commonly performed with the use of a phonetic regression class tree, which renders robust context-dependent modeling possible by carefully balancing the amount of training data with the degree of tying. However, tying inevitably introduces quantization error: triphones tied to the same state are not distinguishable in that state. Recently, we proposed a new triphone modeling approach called eigentriphone modeling, in which all triphone models are, in general, distinct. The idea is to create an eigenbasis for each base phone (or phone state) and to represent all its triphones (or triphone states) as distinct points in the space spanned by that basis. We have shown that triphone HMMs trained using model-based or state-based eigentriphones perform at least as well as conventional tied-state HMMs. In this paper, we further generalize the definition of eigentriphones over clusters of acoustic units. Our experiments on TIMIT phone recognition and Wall Street Journal 5K-vocabulary continuous speech recognition show that eigentriphones estimated from state clusters, defined by the nodes of the same phonetic regression class tree used in state tying, yield a further performance gain.
Keywords :
eigenvalues and eigenfunctions; hidden Markov models; pattern clustering; regression analysis; speech recognition; trees (mathematics); TIMIT phone recognition; Wall Street Journal 5K-vocabulary continuous speech; acoustic units; automatic speech recognizers; base phone; context-dependent acoustic modeling; distinct points; model-based eigentriphones; performance gain; phonetic regression class tree; quantization error; state clusters; state-based eigentriphones; tied-state HMM; tied-state triphone hidden Markov models; training data; Acoustics; Context; Context modeling; Hidden Markov models; Robustness; Speech; Speech recognition; Eigentriphone; context dependency; regularization; tied state; weighted PCA;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2013.2248722
Filename :
6470660