• DocumentCode
    730691
  • Title

    Employment of Subspace Gaussian Mixture Models in speaker recognition

  • Author

    Motlicek, Petr ; Dey, Subhadeep ; Madikeri, Srikanth ; Burget, Lukas

  • Author_Institution
    Idiap Res. Inst., Martigny, Switzerland
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    4445
  • Lastpage
    4449
  • Abstract
    This paper presents a Subspace Gaussian Mixture Model (SGMM) approach, employed as a probabilistic generative model to estimate speaker vector representations that are subsequently used in the speaker verification task. SGMMs have already been shown to significantly outperform traditional HMM/GMMs in Automatic Speech Recognition (ASR) applications. An extension to the basic SGMM framework allows low-dimensional speaker vectors to be robustly estimated and exploited for speaker adaptation. We propose a speaker verification framework based on low-dimensional speaker vectors estimated using SGMMs, trained in an ASR manner using manual transcriptions. To test the robustness of the system, we evaluate the proposed approach against the state-of-the-art i-vector extractor on the NIST SRE 2010 evaluation set under four utterance-length conditions: 3 sec-10 sec, 10 sec-30 sec, 30 sec-60 sec, and full (untruncated) utterances. Experimental results reveal that while the i-vector system performs better on truncated 3 sec to 10 sec and 10 sec to 30 sec utterances, noticeable improvements are observed with SGMMs, especially on full-length utterances. Finally, the proposed SGMM approach exhibits complementary properties and can thus be efficiently fused with an i-vector based speaker verification system.
  • Keywords
    Gaussian processes; hidden Markov models; mixture models; speaker recognition; ASR; HMM/GMMs; SGMM; SGMM framework; automatic speech recognition; probabilistic generative model; speaker vector representation estimation; speaker vectors; speaker verification framework; speaker verification system; subspace Gaussian mixture models; Acoustics; Adaptation models; NIST; Speech; Speech recognition; i-vectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
  • Conference_Location
    South Brisbane, QLD
  • Type

    conf

  • DOI
    10.1109/ICASSP.2015.7178811
  • Filename
    7178811