I-matrix for text-independent speaker recognition

Author

Liang He ; Jia Liu

Author_Institution

Dept. of Electron. Eng., Tsinghua Univ., Beijing, China

fYear

2013

Firstpage

7194

Lastpage

7198

Abstract

This paper proposes an i-matrix for text-independent speaker recognition. The framework of the proposed i-matrix is similar to that of an i-vector. However, the proposed method takes short-time cepstral feature matrices as inputs, so that the statistical modeling phase exploits both the cepstral feature distribution and temporal information for the recognition task. In the i-matrix, the variability of an utterance is constrained by two subspaces, U and V, which are estimated by an iterative method on a large database. Once U and V have been estimated, each utterance is represented by an i-matrix. The decision function is a cosine kernel. Experiments were carried out on the tel-tel-English condition of the NIST SRE 2008 core task. Compared with i-vector-LDA, i-matrix-LDA achieved relative decreases of 4.82% and 5.12% in average EER and MDCF, respectively.
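
The abstract names a cosine kernel as the decision function but does not say how it is applied to matrix-valued representations. The following is a minimal sketch, assuming the kernel is the Frobenius inner product of two i-matrices divided by the product of their Frobenius norms (equivalent to flattening each matrix into a vector). The names `cosine_kernel` and `accept`, and the `threshold` parameter, are hypothetical and do not come from the paper.

```python
import math


def cosine_kernel(a, b):
    """Cosine similarity between two equally shaped matrices (lists of rows).

    Assumption: matrices are compared through the Frobenius inner product,
    which is the same as flattening them into vectors first.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    # Frobenius inner product: sum of element-wise products.
    dot = sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    norm_a = math.sqrt(sum(x * x for ra in a for x in ra))
    norm_b = math.sqrt(sum(y * y for rb in b for y in rb))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine kernel is undefined for an all-zero matrix")
    return dot / (norm_a * norm_b)


def accept(target, test, threshold):
    """Hypothetical decision rule: accept the trial if the score reaches the threshold."""
    return cosine_kernel(target, test) >= threshold
```

Because the score is normalized by both norms, it depends only on the direction of each representation and not on its magnitude. The threshold would have to be tuned on development data.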

Keywords

cepstral analysis; iterative methods; matrix algebra; speaker recognition; statistical analysis; EER; MDCF; NIST SRE 2008 core task; cepstral feature distribution; cosine kernel; i-matrix-LDA; i-vector; iterative method; short-time cepstral feature matrices; statistical modeling; tel-tel-English condition; temporal information; text-independent speaker recognition; Cepstral analysis; Covariance matrices; Equations; Feature extraction; NIST; Speaker recognition; Vectors; Gaussian mixture models; I-vector; i-matrix; text-independent speaker recognition

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on

Conference_Location

Vancouver, BC

ISSN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2013.6639059

Filename

6639059