DocumentCode
134274
Title
Research on generalization property of time-varying Fbank-weighted MFCC for i-vector based speaker verification
Author
Jun Wang ; Lantian Li ; Dong Wang ; Thomas Fang Zheng
Author_Institution
Center for Speech & Language Technol., Tsinghua Univ., Beijing, China
fYear
2014
fDate
12-14 Sept. 2014
Firstpage
423
Lastpage
423
Abstract
MFCC is one of the most popular features used in speaker verification, but it carries not only speaker information but also content and channel information. A session-aware Fbank weighting approach has been proposed, in which the Fbanks that are more sensitive to session variance are de-weighted so that speaker-discriminative banks are given prominence. Most current research on Fbank weighting is within the GMM-UBM framework. In this paper, we study the contribution of Fbank weighting in the state-of-the-art i-vector architecture. We found that, because the loading matrix in the i-vector model is learned without supervision, Fbank weighting shows no advantage in i-vector systems if simple cosine-distance scoring is used. However, when discriminative models such as LDA/PLDA are applied, the advantage of Fbank weighting is recovered, leading to a significant performance improvement. We also verified that the weighting parameters generalize well: parameters trained on a small bilingual database can be applied successfully in another i-vector system trained on a large multi-channel database.
Keywords
matrix algebra; speaker recognition; unsupervised learning; GMM-UBM framework; I-vector architecture; MFCC; PLDA; cosine-distance scoring; generalization property; session-aware Fbank weighting approach; speaker discriminative banks; speaker verification; time-varying Fbank-weighting; unsupervised learned loading matrix; Abstracts; Databases; Educational institutions; Load modeling; Loading; Mel frequency cepstral coefficient; Speech; frequency-weighting; i-vector
fLanguage
English
Publisher
IEEE
Conference_Titel
Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
Conference_Location
Singapore
Type
conf
DOI
10.1109/ISCSLP.2014.6936667
Filename
6936667