• DocumentCode
    1097140
  • Title

    In-Set/Out-of-Set Speaker Recognition Under Sparse Enrollment

  • Author

    Prakash, Vinod ; Hansen, John H. L.

  • Volume
    15
  • Issue
    7
  • fYear
    2007
  • Firstpage
    2044
  • Lastpage
    2052
  • Abstract
    This paper addresses the problem of identifying in-set versus out-of-set speakers using extremely limited enrollment data. The recognition objective is to form a binary decision as to whether an input speaker is a legitimate member of a set of enrolled speakers. The emphasis is on short enrollment data (about 5 s of speech per enrolled speaker) and short test data (2-8 s) in a text-independent scenario. To overcome the limited enrollment data, data from speakers who are acoustically close to a given in-set speaker are used to form an informative prior (base model) for speaker adaptation. Score normalization for in-set systems is addressed, and the difficulty of using conventional score normalization schemes for in-set speaker recognition is highlighted. Score normalization techniques based on distribution scaling are developed specifically for the in-set/out-of-set problem and compared against existing score normalization schemes used in open-set speaker recognition. Experiments are performed on three separate corpora: (1) noise-free TIMIT; (2) noisy in-vehicle CU-move; and (3) the NIST-SRE-2006 database. Experimental results show a consistent increase in system performance for the proposed techniques.
  • Keywords
    binary decision diagrams; speaker recognition; NIST-SRE-2006 database; binary decision; distribution scaling; in-set speaker recognition; noise-free TIMIT; noisy in-vehicle CU-move; out-of-set speaker recognition; score normalization; sparse enrollment; speaker adaptation; Acoustic testing; Adaptation model; Databases; Loudspeakers; Robustness; Speech analysis; Streaming media; System performance; Training data; NIST-SRE; binary classification; cohort speakers; in-set/out-of-set; in-vehicle CU-move; limited training data
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type

    jour

  • DOI
    10.1109/TASL.2007.902058
  • Filename
    4291611