Title :
In-Set/Out-of-Set Speaker Recognition Under Sparse Enrollment
Author :
Prakash, Vinod ; Hansen, John H L
Abstract :
This paper addresses the problem of identifying in-set versus out-of-set speakers using extremely limited enrollment data. The recognition objective is a binary decision: whether or not an input speaker is a legitimate member of a set of enrolled speakers. The emphasis is on short enrollment data (about 5 sec of speech for each enrolled speaker) and short test data (2-8 sec) in a text-independent scenario. To overcome the limited enrollment, data from speakers that are acoustically close to a given in-set speaker are used to form an informative prior (base model) for speaker adaptation. Score normalization for in-set systems is addressed, and the difficulty of using conventional score normalization schemes for in-set speaker recognition is highlighted. Score normalization techniques based on distribution scaling are developed specifically for the in-set/out-of-set problem and compared against existing score normalization schemes used in open-set speaker recognition. Experiments are performed on three separate corpora: (1) noise-free TIMIT; (2) noisy in-vehicle CU-move; and (3) the NIST-SRE-2006 database. Experimental results show a consistent increase in system performance with the proposed techniques.
Keywords :
binary decision diagrams; speaker recognition; NIST-SRE-2006 database; binary decision; distribution scaling; in-set speaker recognition; noise-free TIMIT; noisy in-vehicle CU-move; out-of-set speaker recognition; score normalization; sparse enrollment; speaker adaptation; Acoustic testing; Adaptation model; Databases; Loudspeakers; Robustness; Speaker recognition; Speech analysis; Streaming media; System performance; Training data; NIST-SRE; binary classification; cohort speakers; in-set/out-of-set; in-vehicle CU-move; limited training data
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2007.902058