• DocumentCode
    134348
  • Title
    An iVector extractor using pre-trained neural networks for speaker verification
  • Author
    Shanshan Zhang ; Rong Zheng ; Bo Xu

  • Author_Institution
    Interactive Digital Media Technol. Res. Center, Inst. of Autom., Beijing, China
  • fYear
    2014
  • fDate
    12-14 Sept. 2014
  • Firstpage
    73
  • Lastpage
    77
  • Abstract
    The iVector representation of speech utterances is widely used in speaker and language recognition tasks. In this paper, an iVector extractor using pre-trained neural networks is proposed for speaker verification, as an alternative to the classical total variability approach. In the proposed system, a neural network with a bottleneck layer is trained on speaker-labeled utterances, and the bottleneck features of the network are then used to represent the input utterance. This new iVector representation shows performance comparable to the state-of-the-art Total Variability Model (TVM) based iVector extraction system on NIST 2008 SRE. Combining the proposed extraction system with the TVM system further achieves a 10% reduction in equal error rate.
  • Keywords
    feature extraction; neural nets; speaker recognition; vectors; NIST 2008 SRE; TVM based iVector extraction system; equal error rates; iVector extractor; iVector representation; input utterance; language recognition tasks; pretrained neural networks; speaker labeled utterances; speaker recognition tasks; speaker verification; speech utterances; total variability model; Artificial neural networks; Data mining; NIST; Speech recognition; Training; bottleneck feature
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
  • Conference_Location
    Singapore
  • Type
    conf
  • DOI
    10.1109/ISCSLP.2014.6936722
  • Filename
    6936722