DocumentCode :
134274
Title :
Research on generalization property of time-varying Fbank-weighted MFCC for i-vector based speaker verification
Author :
Jun Wang ; Lantian Li ; Dong Wang ; Zheng, Thomas Fang
Author_Institution :
Center for Speech & Language Technol., Tsinghua Univ., Beijing, China
fYear :
2014
fDate :
12-14 Sept. 2014
Firstpage :
423
Lastpage :
423
Abstract :
MFCC is one of the most popular features used in speaker verification. It carries not only speaker information but also information about content and channel. A session-aware Fbank weighting approach has been proposed, in which the Fbanks that are more sensitive to session variance are de-weighted so that speaker-discriminative banks are given prominence. Most current research on Fbank weighting is within the GMM-UBM framework. In this paper, we study the contribution of Fbank weighting in the state-of-the-art i-vector architecture. We found that, because the loading matrix in the i-vector model is learned without supervision, Fbank weighting shows no advantage in i-vector systems if simple cosine-distance scoring is used. However, when discriminative models such as LDA/PLDA are applied, the advantage of Fbank weighting is recovered, leading to a significant performance improvement. We also verified that the weighting parameters generalize well: parameters trained on a small bilingual database can be applied successfully in another i-vector system trained on a large multi-channel database.
Keywords :
matrix algebra; speaker recognition; unsupervised learning; GMM-UBM framework; I-vector architecture; MFCC; PLDA; cosine-distance scoring; generalization property; session-aware Fbank weighting approach; speaker discriminative banks; speaker verification; time-varying Fbank-weighting; unsupervised learned loading matrix; Abstracts; Databases; Educational institutions; Load modeling; Loading; Mel frequency cepstral coefficient; Speech; frequency-weighting; i-vector; speaker verification;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
Conference_Location :
Singapore
Type :
conf
DOI :
10.1109/ISCSLP.2014.6936667
Filename :
6936667