Speaker verification using kernel-based binary classifiers with binary operation derived features

Author

Hung-Shin Lee ; Yu Tso ; Yun-Fan Chang ; Hsin-Min Wang ; Shyh-Kang Jeng

Author_Institution

Dept. of Electr. Eng., Nat. Taiwan Univ., Taipei, Taiwan

fYear

2014

fDate

4-9 May 2014

Firstpage

1660

Lastpage

1664

Abstract

In this paper, we study the use of two kinds of kernel-based discriminative models, namely the support vector machine (SVM) and the deep neural network (DNN), for speaker verification. We treat verification as a binary classification problem in which a pair of utterances, each represented by an i-vector, is assumed to belong to either the “within-speaker” group or the “between-speaker” group. To solve the problem, we apply various binary operations that retain the basic relationship between the two i-vectors of a pair and combine them into a single vector for training the discriminative models. We also investigate how the achievable performance of the SVM and DNN binary classifiers varies with the number of training pairs and with different combinations of the basic binary operations. The experiments are conducted on the male portion of the core task in the NIST 2005 Speaker Recognition Evaluation (SRE). In terms of normalized decision cost function (minDCF) and equal error rate (EER), the results are competitive with or better than those of other non-probabilistic models, such as conventional speaker SVMs and LDA-based cosine distance scoring.
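The pairing idea in the abstract can be illustrated with a minimal sketch. The abstract does not list the specific binary operations, so the three element-wise operations below (sum, absolute difference, product) are assumptions chosen because they are symmetric in the two inputs; the function names `pair_features` and `make_pairs` and the toy random data are hypothetical and not taken from the paper.

```python
import numpy as np

def pair_features(w1, w2, ops=("sum", "absdiff", "prod")):
    """Combine two i-vectors into one feature vector.

    The operations here are illustrative assumptions, not the paper's
    confirmed choices. Each one is symmetric, so the order of the two
    utterances in a pair does not change the result.
    """
    w1 = np.asarray(w1, dtype=float)
    w2 = np.asarray(w2, dtype=float)
    if w1.shape != w2.shape:
        raise ValueError("i-vectors must have the same shape")
    table = {
        "sum": w1 + w2,
        "absdiff": np.abs(w1 - w2),
        "prod": w1 * w2,
    }
    # Concatenate the selected operation outputs into a single vector.
    return np.concatenate([table[name] for name in ops])

def make_pairs(ivectors, speaker_ids):
    """Build labeled pairs: 1 = within-speaker, 0 = between-speaker."""
    feats, labels = [], []
    n = len(ivectors)
    for i in range(n):
        for j in range(i + 1, n):
            feats.append(pair_features(ivectors[i], ivectors[j]))
            labels.append(int(speaker_ids[i] == speaker_ids[j]))
    return np.stack(feats), np.array(labels)

# Toy data: four random 5-dimensional "i-vectors" from two speakers.
rng = np.random.default_rng(0)
ivecs = rng.standard_normal((4, 5))
spk = ["a", "a", "b", "b"]
X, y = make_pairs(ivecs, spk)
```

The resulting `X` and `y` could then be passed to any binary classifier, such as an SVM or a DNN, with the classifier's score on a test pair serving as the verification score.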

Keywords

neural nets; speaker recognition; support vector machines; vectors; DNN; EER; LDA-based cosine distance scoring; NIST 2005 speaker recognition evaluation; SRE; SVM; between-speaker group; binary classification problem; binary operation; correlation; deep neural network; equal error rate; i-vector; kernel-based discriminative models; minDCF; nonprobabilistic based models; normalized decision cost function; speaker verification; support vector machine; within-speaker group; Data models; Hidden Markov models; Kernel; Speech; Training

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on

Conference_Location

Florence

Type

conf

DOI

10.1109/ICASSP.2014.6853880

Filename

6853880