DocumentCode :
134262
Title :
A new fast and memory effective i-vector extraction based on factor analysis of KLD derived GMM supervector
Author :
Zhi-Yi Li ; Wei-Qiang Zhang ; Yao Tian ; Jia Liu
Author_Institution :
Dept. of Electron. Eng., Tsinghua Univ., Beijing, China
fYear :
2014
fDate :
12-14 Sept. 2014
Firstpage :
163
Lastpage :
167
Abstract :
The i-vector model is currently the state-of-the-art technology for speaker recognition. It represents a speech utterance as a low-dimensional, fixed-length, compact i-vector. For some real applications, the i-vector extraction procedure is relatively slow and requires too much memory. Several fast extraction methods based on numerical approximation have been proposed to speed up the computation and save memory. However, all of them come at the expense of some performance degradation. From a model approximation viewpoint, we first propose a novel fast i-vector extraction method based on subspace factor analysis of the Kullback-Leibler divergence (KLD) derived Gaussian mixture model (GMM) supervector. Experimental results on NIST SRE datasets demonstrate that the proposed method is faster and performs better than all the existing methods at a similar run-time ratio. In addition, because it takes a different modeling viewpoint, we propose combining it with factorized subspace based extraction. The combined method avoids the accuracy degradation and can even perform better than the standard method, while its extraction can be 10 times faster than the standard method.
Keywords :
Gaussian processes; approximation theory; feature extraction; speaker recognition; Gaussian mixture model; KLD derived GMM supervector; Kullback-Leibler divergence; combination method; factorized subspace based extraction; i-vector extraction; i-vector extraction procedure; low-dimensional fix-length compact i-vector; numerical approximation based fast extraction method; speaker recognition; speech utterance representation; subspace factor analysis; Approximation methods; Frequency selective surfaces; NIST; Speaker recognition; Speech; Vectors; Kullback-Leibler divergence; factor analysis; fast extraction; i-vector; speaker recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
Conference_Location :
Singapore
Type :
conf
DOI :
10.1109/ISCSLP.2014.6936655
Filename :
6936655