Regional Information Center for Science and Technology - A new fast and memory effective i-vector extraction based on factor analysis of KLD derived GMM supervector

DocumentCode :

134262

Title :

A new fast and memory effective i-vector extraction based on factor analysis of KLD derived GMM supervector

Author :

Zhi-Yi Li ; Wei-Qiang Zhang ; Yao Tian ; Jia Liu

Author_Institution :

Dept. of Electron. Eng., Tsinghua Univ., Beijing, China

fYear :

2014

fDate :

12-14 Sept. 2014

Firstpage :

163

Lastpage :

167

Abstract :

The i-vector model is currently the state-of-the-art technology for speaker recognition. It represents a speech utterance as a compact, low-dimensional, fixed-length i-vector. For some real applications, however, the i-vector extraction procedure is relatively slow and requires too much memory. Several fast extraction methods based on numerical approximation have been proposed to speed up the computation and save memory, but all of them come at the expense of some performance degradation. From a model approximation viewpoint, we propose a novel fast i-vector extraction method based on subspace factor analysis of the Gaussian mixture model (GMM) supervector derived from the Kullback-Leibler divergence (KLD). Experimental results on NIST SRE datasets demonstrate that the proposed method is faster and performs better than all existing methods at a similar run-time ratio. In addition, because of the different modeling viewpoint, we propose a method that combines it with factorized subspace based extraction. This combined method avoids the accuracy degradation and can even perform better than the standard method, while its extraction can be 10 times faster.

Keywords :

Gaussian processes; approximation theory; feature extraction; speaker recognition; Gaussian mixture model; KLD derived GMM supervector; Kullback-Leibler divergence; combination method; factorized subspace based extraction; i-vector extraction; i-vector extraction procedure; low-dimensional fix-length compact i-vector; numerical approximation based fast extraction method; speech utterance representation; subspace factor analysis; Approximation methods; Frequency selective surfaces; NIST; Speech; Vectors; factor analysis; fast extraction; i-vector

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on

Conference_Location :

Singapore

Type :

conf

DOI :

10.1109/ISCSLP.2014.6936655

Filename :

6936655

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=134262