Regional Information Center for Science and Technology - Eigentriphones for Context-Dependent Acoustic Modeling

DocumentCode :

69529

Title :

Eigentriphones for Context-Dependent Acoustic Modeling

Author :

Ko, Tae Kuk ; Mak, Brian

Author_Institution :

Dept. of Comput. Sci. & Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China

Volume :

Issue :

fYear :

2013

fDate :

Jun-13

Firstpage :

1285

Lastpage :

1294

Abstract :

Most automatic speech recognizers employ tied-state triphone hidden Markov models (HMMs), in which the corresponding triphone states of the same base phone are tied. State tying is commonly performed using a phonetic regression class tree, which makes robust context-dependent modeling possible by carefully balancing the amount of training data against the degree of tying. However, tying inevitably introduces quantization error: triphones tied to the same state are not distinguishable in that state. Recently, we proposed a new triphone modeling approach called eigentriphone modeling, in which all triphone models are, in general, distinct. The idea is to create an eigenbasis for each base phone (or phone state) and to represent all of its triphones (or triphone states) as distinct points in the space spanned by that basis. We have shown that triphone HMMs trained using model-based or state-based eigentriphones perform at least as well as conventional tied-state HMMs. In this paper, we further generalize the definition of eigentriphones over clusters of acoustic units. Our experiments on TIMIT phone recognition and Wall Street Journal 5K-vocabulary continuous speech recognition show that eigentriphones estimated from state clusters defined by the nodes of the same phonetic regression class tree used in state tying yield a further performance gain.
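
The eigenbasis idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each triphone of a base phone is summarized by a fixed-length supervector (for example, stacked Gaussian means), uses hypothetical frame counts as weights in a weighted PCA (the keywords mention weighted PCA), and leaves out the regularized estimation of the coordinates. The function names, the random data, and the counts are all made up for illustration.

```python
import numpy as np


def eigentriphone_basis(supervectors, counts, num_eigen):
    """Weighted PCA over the triphone supervectors of one base phone.

    supervectors: (n_triphones, dim) array, one row per triphone (assumed layout).
    counts: per-triphone frame counts used as weights (hypothetical values).
    Returns the weighted mean, the top `num_eigen` basis vectors (as rows),
    and the coordinates of every triphone in that basis.
    """
    X = np.asarray(supervectors, dtype=float)
    w = np.asarray(counts, dtype=float)
    w = w / w.sum()
    mean = w @ X                      # weighted mean supervector
    centered = X - mean
    # Right singular vectors of the sqrt-weighted, centered data are the
    # eigenvectors of the weighted covariance matrix.
    _, _, vt = np.linalg.svd(np.sqrt(w)[:, None] * centered, full_matrices=False)
    basis = vt[:num_eigen]            # orthonormal rows
    coords = centered @ basis.T       # each triphone is a point in the eigenspace
    return mean, basis, coords


def reconstruct(mean, basis, coords):
    """Map eigenspace coordinates back to supervectors."""
    return mean + coords @ basis


# Toy data: 5 triphones of one base phone, 8-dimensional supervectors.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
counts = [200, 50, 10, 5, 1]          # frequent triphones dominate the basis

mean, basis, coords = eigentriphone_basis(X, counts, num_eigen=2)
mean_full, basis_full, coords_full = eigentriphone_basis(X, counts, num_eigen=4)
X_full = reconstruct(mean_full, basis_full, coords_full)
```

With five triphones the centered data has rank at most four, so four basis vectors reproduce every supervector exactly and each triphone keeps its own distinct coordinates; using fewer basis vectors trades that exactness for fewer parameters per triphone.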

Keywords :

eigenvalues and eigenfunctions; hidden Markov models; pattern clustering; regression analysis; speech recognition; trees (mathematics); TIMIT phone recognition; Wall Street Journal 5K-vocabulary continuous speech; acoustic units; automatic speech recognizers; base phone; context-dependent acoustic modeling; distinct points; model-based eigentriphones; performance gain; phonetic regression class tree; quantization error; state clusters; state-based eigentriphones; tied-state HMM; tied-state triphone hidden Markov models; training data; Acoustics; Context; Context modeling; Hidden Markov models; Robustness; Speech; Speech recognition; Eigentriphone; context dependency; regularization; tied state; weighted PCA;

fLanguage :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1558-7916

Type :

jour

DOI :

10.1109/TASL.2013.2248722

Filename :

6470660

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=69529