Regional Information Center for Science and Technology (RICEST) - Discriminative training of weighted polynomial vector for acoustic language recognition

DocumentCode :

3165873

Title :

Discriminative training of weighted polynomial vector for acoustic language recognition

Author :

Zhang, Ce ; Zheng, Rong ; Xu, Bo

Author_Institution :

Digital Content Technol. Res. Center, Inst. of Autom., Beijing, China

fYear :

2012

fDate :

25-30 March 2012

Firstpage :

4849

Lastpage :

4852

Abstract :

In this paper, we propose a discriminative method for acoustic-feature-based language recognition that modifies the polynomial expansion in the generalized linear discriminant sequence (GLDS) kernel. It is inspired by the Gaussian mixture model-support vector machine (GMM-SVM) system, which has been used successfully in both speaker and language recognition. Because of computational constraints in our method, it is nearly impossible to stack component-dependent polynomial expansion vectors as the GMM-SVM system does. We therefore introduce a set of language-dependent weights to fuse these expansion vectors, and use the maximum mutual information (MMI) criterion and logistic regression to estimate the model parameters. Finally, we evaluate our method on the closed-set, 30-second test condition of NIST LRE 2007, where it achieves up to 30% relative improvement over the baseline GLDS system.
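The abstract gives no formulas, so the following Python sketch is only an illustration of the general idea, not the authors' implementation. It assumes a standard GLDS-style average polynomial expansion, soft frame-to-component posteriors supplied from elsewhere (for example, a GMM), and one fusion weight per component. The function names, the `posteriors` input, and the `weights` vector are all hypothetical, and the MMI and logistic-regression training of the weights is not shown.

```python
from itertools import combinations_with_replacement

import numpy as np


def polynomial_expansion(x, degree=2):
    """Return all monomials of the entries of x up to `degree`, constant term first."""
    terms = [1.0]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(x)), d):
            terms.append(float(np.prod(x[list(idx)])))
    return np.array(terms)


def glds_vector(frames, degree=2):
    """GLDS-style utterance vector: the polynomial expansion averaged over all frames."""
    return np.mean([polynomial_expansion(f, degree) for f in frames], axis=0)


def weighted_polynomial_vector(frames, posteriors, weights, degree=2):
    """Fuse component-dependent average expansions into a single vector.

    frames:     (T, d) acoustic feature vectors of one utterance
    posteriors: (T, C) soft assignment of each frame to C components
                (assumed to be given; every component needs nonzero occupancy)
    weights:    (C,) hypothetical language-dependent fusion weights
    """
    expansions = np.array([polynomial_expansion(f, degree) for f in frames])  # (T, D)
    occupancy = posteriors.sum(axis=0)                                        # (C,)
    component_means = (posteriors.T @ expansions) / occupancy[:, None]        # (C, D)
    # Weighted sum instead of stacking: the output stays D-dimensional.
    return weights @ component_means
```

Under these assumptions, fusing with weights keeps the dimension of the output equal to that of a single expansion vector, whereas stacking the component-dependent vectors would multiply it by the number of components. With uniform posteriors and weights that sum to one, the sketch reduces to the plain `glds_vector`.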

Keywords :

Gaussian processes; regression analysis; speaker recognition; support vector machines; GMM-SVM system; Gaussian mixture model-support vector machine; MMI criterion; NIST LRE 2007; baseline GLDS system; component dependent polynomial expansion vectors; discriminative training; feature based language recognizer; generalized linear discriminant sequence kernel; language dependent weights; logistic regression; maximum mutual information criterion; model parameter estimation; weighted polynomial vector; Acoustics; Kernel; Logistics; Polynomials; Support vector machines; Training; Vectors; GMM; Language recognition; maximum mutual information; multi-class logistic regression; weighted GLDS;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on

Conference_Location :

Kyoto

ISSN :

1520-6149

Print_ISBN :

978-1-4673-0045-2

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2012.6289005

Filename :

6289005

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3165873