Title :
Comparing maximum a posteriori vector quantization and Gaussian mixture models in speaker verification
Author :
Kinnunen, Tomi ; Saastamoinen, Juhani ; Hautamäki, Ville ; Vinni, Mikko ; Fränti, Pasi
Author_Institution :
Dept. of Comput. Sci. & Stat., Univ. of Joensuu, Joensuu
Abstract :
The Gaussian mixture model with universal background model (GMM-UBM) is a standard reference classifier in speaker verification. We have proposed a simplified model using vector quantization (VQ-UBM). In this study, we extensively compare these two classifiers on the NIST 2005, 2006 and 2008 SRE corpora, using a standard discriminative classifier (GLDS-SVM) as a reference point. We focus on parameter settings for N-top scoring, model order, and performance for different amounts of training data. The most interesting result, contrary to general belief, is that GMM-UBM yields better results for short segments, whereas VQ-UBM performs well on long utterances. The results also suggest that maximum likelihood training of the UBM is sub-optimal and that alternative ways to train the UBM should therefore be considered.
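The abstract does not spell out how a VQ-UBM system adapts or scores its models, so the following is only a minimal sketch under stated assumptions. It assumes hard nearest-centroid assignment, a relevance-factor style MAP update of the UBM centroids, and a score defined as the drop in mean squared quantization error when the speaker codebook replaces the UBM codebook. The function names (`nearest`, `map_adapt`, `distortion`, `score`) and the relevance factor `r` are hypothetical and are not taken from the paper.

```python
import math

def nearest(x, codebook):
    """Return (index, squared distance) of the centroid closest to x."""
    best_k, best_d = 0, math.inf
    for k, c in enumerate(codebook):
        d = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        if d < best_d:
            best_k, best_d = k, d
    return best_k, best_d

def map_adapt(ubm, train, r=16.0):
    """Shift each UBM centroid toward the mean of the training vectors
    assigned to it. The weight n / (n + r) is an assumed update rule:
    centroids with few assigned vectors stay close to the UBM."""
    dim = len(ubm[0])
    counts = [0] * len(ubm)
    sums = [[0.0] * dim for _ in ubm]
    for x in train:
        k, _ = nearest(x, ubm)
        counts[k] += 1
        for j in range(dim):
            sums[k][j] += x[j]
    adapted = []
    for k, c in enumerate(ubm):
        n = counts[k]
        if n == 0:
            adapted.append(list(c))  # no data: keep the UBM centroid
            continue
        w = n / (n + r)
        adapted.append([w * sums[k][j] / n + (1 - w) * c[j] for j in range(dim)])
    return adapted

def distortion(test, codebook):
    """Mean squared quantization error of the test vectors."""
    return sum(nearest(x, codebook)[1] for x in test) / len(test)

def score(test, ubm, speaker):
    """Assumed verification score: positive when the speaker codebook
    fits the test vectors better than the UBM does."""
    return distortion(test, ubm) - distortion(test, speaker)
```

In use, `map_adapt` would be called once per enrolled speaker on that speaker's feature vectors (for example MFCCs), and `score` would then be compared against a decision threshold tuned on development data.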
Keywords :
maximum likelihood estimation; signal classification; speaker recognition; speech coding; support vector machines; vector quantisation; GLDS-SVM; Gaussian mixture models-universal background model; discriminative classifier; maximum a posteriori vector quantization; maximum likelihood training; speaker verification; standard reference classifier; Computer science; Feature extraction; Image processing; NIST; Speaker recognition; Speech processing; Statistics; Testing; Training data; Vector quantization; Gaussian mixture model (GMM); MAP training; MFCCs; Speaker verification; vector quantization (VQ);
Conference_Title :
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location :
Taipei
Print_ISBN :
978-1-4244-2353-8
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2009.4960562