Regional Information Center for Science and Technology - Improved Semi-Parametric Mean Trajectory Model Using Discriminatively Trained Centroids

DocumentCode :

2066214

Title :

Improved Semi-Parametric Mean Trajectory Model Using Discriminatively Trained Centroids

Author :

Xu, Ran ; Pan, Jielin ; Yan, Yonghong

Author_Institution :

ThinkIT Speech Lab. Inst. of Acoust., Chinese Acad. of Sci., Beijing, China

fYear :

2008

fDate :

16-19 Dec. 2008

Firstpage :

Lastpage :

Abstract :

To alleviate the limitation imposed by the "state output probability conditional independence" assumption of hidden Markov models (HMMs) in speech recognition, a discriminative semi-parametric trajectory model was proposed in recent years, in which both the means and the variances of the acoustic models are modeled as time-varying variables. The time-varying information is modeled as a weighted contribution from all the "centroids", which can be viewed as a representation of the acoustic space. In the previous literature, such centroids are usually obtained either by clustering the Gaussians of the baseline acoustic models down to a reasonable number or by training a baseline model with fewer Gaussian components. Centroids obtained in this way are maximum likelihood estimates of the acoustic space and are relatively weak in discriminability compared with discriminatively trained acoustic models. In this paper, we propose an improved training framework for the semi-parametric mean trajectory model, in which the centroids are first discriminatively trained with the minimum phone error criterion to provide a more discriminative representation of the acoustic space. The method was evaluated on a Mandarin digit string recognition task. The experimental results show that the proposed method achieves a relative string error rate reduction of 7.5% compared with the traditional discriminative semi-parametric trajectory model, and a relative string error rate reduction of 28.6% compared with the baseline acoustic model trained with the maximum likelihood criterion.
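The two relative string error rate reductions quoted in the abstract are measured against different reference systems, so they cannot simply be subtracted. The sketch below shows the standard definition of a relative reduction and how the two quoted figures (7.5% and 28.6%) jointly imply a gain for the traditional trajectory model over the maximum likelihood baseline. The function names and the absolute error rates in the usage note are hypothetical and chosen only for illustration; the record does not report absolute error rates.

```python
def relative_reduction(reference_err: float, new_err: float) -> float:
    """Relative error rate reduction of new_err with respect to reference_err."""
    if reference_err <= 0:
        raise ValueError("reference_err must be positive")
    return (reference_err - new_err) / reference_err


def implied_intermediate_reduction(total_rel: float, step_rel: float) -> float:
    """Reduction of an intermediate system over the baseline.

    Assumes a chain baseline -> intermediate -> final, where total_rel is the
    reduction of final over baseline and step_rel is the reduction of final
    over intermediate. The remaining error fractions multiply:
        (1 - intermediate_rel) * (1 - step_rel) = (1 - total_rel)
    """
    if not 0 <= step_rel < 1:
        raise ValueError("step_rel must be in [0, 1)")
    return 1.0 - (1.0 - total_rel) / (1.0 - step_rel)


# Figures taken from the abstract: 28.6% over the ML baseline,
# 7.5% over the traditional discriminative trajectory model.
traditional_vs_baseline = implied_intermediate_reduction(0.286, 0.075)
print(f"implied reduction of traditional model over baseline: "
      f"{traditional_vs_baseline:.1%}")
```

Under these assumptions the traditional trajectory model would account for a relative reduction of roughly 22.8% over the baseline, with the discriminatively trained centroids contributing the remainder. As a purely hypothetical check, `relative_reduction(0.10, 0.0714)` evaluates to about 0.286.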

Keywords :

hidden Markov models; maximum likelihood estimation; speech recognition; discriminatively trained centroids; semi-parametric mean trajectory model; time-varying information; Acoustics; Error analysis; Gaussian processes; Laboratories; Mutual information; Radio access networks; Vocabulary

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Chinese Spoken Language Processing, 2008. ISCSLP '08. 6th International Symposium on

Conference_Location :

Kunming

Print_ISBN :

978-1-4244-2942-4

Electronic_ISBN :

978-1-4244-2943-1

Type :

conf

DOI :

10.1109/CHINSL.2008.ECP.63

Filename :

4730317

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2066214