Title :
Speaker verification with the mixture of Gaussian factor analysis based representation
Author_Institution :
CMU Joint Inst. of Eng., Sun Yat-Sen Univ., Guangzhou, China
Abstract :
This paper presents a generalized i-vector representation framework using the mixture of Gaussian (MoG) factor analysis for speaker verification. Conventionally, a single standard factor analysis is adopted to generate a low-rank total variability subspace where the mean supervector is assumed to be Gaussian distributed. The energy that cannot be represented by the low-rank space is modeled by a single multivariate Gaussian. However, due to the sparsity of the frame-level posterior probabilities and the short-duration characteristics, some dimensions of the first-order statistics may not be Gaussian distributed. Therefore, we replace the single Gaussian with a mixture of Gaussians to better represent the residual energy. Experimental results on the NIST SRE 2010 condition 5 female task and the RSR 2015 part 1 female task show that the MoG i-vector outperforms the i-vector baseline by more than 10% relative on both the text-independent and the text-dependent speaker verification tasks.
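The contrast drawn in the abstract can be sketched in equations. This is only an illustrative sketch in generic factor-analysis notation, not the paper's own formulation: the symbols $\mathbf{s}$ (utterance supervector), $\mathbf{m}$ (global mean supervector), $\mathbf{T}$ (low-rank total variability matrix), $\mathbf{w}$ (latent factor, i.e. the i-vector), $\boldsymbol{\epsilon}$ (residual), and the mixture parameters $\pi_k, \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k$ with $K$ components are all assumptions introduced here.

```latex
% Assumed notation; the paper may define these quantities differently.
\begin{align}
  \mathbf{s} &= \mathbf{m} + \mathbf{T}\mathbf{w} + \boldsymbol{\epsilon},
  \qquad \mathbf{w} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})
  && \text{(low-rank factor analysis)} \\
  \boldsymbol{\epsilon} &\sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma})
  && \text{(conventional: single Gaussian residual)} \\
  \boldsymbol{\epsilon} &\sim \sum_{k=1}^{K} \pi_k \,
  \mathcal{N}(\boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k),
  \qquad \sum_{k=1}^{K} \pi_k = 1
  && \text{(MoG: mixture residual)}
\end{align}
```

Under these assumptions, setting $K = 1$ with $\boldsymbol{\mu}_1 = \mathbf{0}$ recovers the single-Gaussian residual, which is the sense in which the MoG model generalizes the conventional one.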
Keywords :
Gaussian processes; speaker recognition; text analysis; Gaussian distributed; MoG factor analysis; frame level posterior probability; generalized i-vector representation framework; mixture of Gaussian factor analysis; single multivariate Gaussian; single standard factor analysis; text dependent speaker verification; text independent speaker verification; Analytical models; Databases; Mel frequency cepstral coefficient; NIST; Noise; Probability; Speaker verification; factor analysis; i-vector; mixture of Gaussian;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD
DOI :
10.1109/ICASSP.2015.7178858