Regional Information Center for Science and Technology - An investigation of tied-mixture GMM based triphone state clustering

DocumentCode :

3165183

Title :

An investigation of tied-mixture GMM based triphone state clustering

Author :

Wang, Guangsen ; Sim, Khe Chai

Author_Institution :

Sch. of Comput., Nat. Univ. of Singapore, Singapore, Singapore

fYear :

2012

fDate :

25-30 March 2012

Firstpage :

4717

Lastpage :

4720

Abstract :

Parameter tying is crucial for robust context-dependent acoustic modeling because it plays a major role in balancing the desired model complexity against the amount of data available. In this paper, a modified decision tree state clustering scheme based on a tied-mixture Gaussian Mixture Model (GMM) is proposed. Instead of using a single-Gaussian untied triphone system, a tied-mixture GMM triphone system is adopted as a better acoustic model for state clustering. The proposed scheme also allows easy incorporation of discriminative training during clustering. Experimental results show that, for a varying number of state clusters, the proposed approach consistently outperforms the standard single-Gaussian-based state tying. The best WER is a 10.5% relative improvement over conventional decision tree clustering, and the proposed scheme achieves its best performance using a much smaller number of state clusters. Moreover, detailed analyses reveal that the proposed GMM clustering has a better state distribution, which leads to 1) better frame-state alignments and 2) better phonetic question selections. These two factors may make the proposed approach superior for clustering.

Keywords :

Gaussian processes; acoustic signal processing; computational complexity; decision trees; pattern clustering; speech recognition; training; Gaussian untied triphone system; WER performance; acoustic model; discriminative training; frame-state alignments; model complexity; modified decision tree state clustering scheme; phonetic question selections; robust context dependent acoustic modeling; state distribution; tied-mixture GMM-based triphone state clustering; tied-mixture Gaussian Mixture Model; Context; Data models; Decision trees; Hidden Markov models; Speech; Training; Training data; Tied-mixture; context dependent modeling; phonetic decision tree; state clustering;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on

Conference_Location :

Kyoto

ISSN :

1520-6149

Print_ISBN :

978-1-4673-0045-2

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2012.6288972

Filename :

6288972

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3165183