• DocumentCode
    2017816
  • Title

    Frame selection of interview channel for NIST speaker recognition evaluation

  • Author

    Hanwu Sun; Bin Ma; Haizhou Li

  • Author_Institution
    Human Language Technol. Dept., A*STAR, Singapore, Singapore
  • fYear
    2010
  • fDate
    Nov. 29 2010-Dec. 3 2010
  • Firstpage
    305
  • Lastpage
    308
  • Abstract
    In this paper, we study a front-end frame selection approach for interview-channel speaker recognition. The approach keeps high-quality speech frames and removes noisy and irrelevant frames before speaker modeling. For robust voice activity detection (VAD) across the different types of microphones located in the interview room, we adopt the spectral subtraction algorithm for noise reduction. An energy-based frame selection algorithm is first applied to indicate speech activity at the frame level. To overcome the summed-channel effects in the interview condition, we study how to extract the relevant speaker's speech frames effectively using the VAD tags and ASR transcript tags provided by NIST. An eigenchannel-based GMM-SVM speaker recognition system is used to evaluate the proposed method. Experiments are conducted on the NIST 2008 and NIST 2010 Speaker Recognition Evaluation interview-interview conditions. The results demonstrate that the approach provides an efficient way to select high-quality speech frames and the relevant speaker's voice in the interview environment for speaker recognition.
  • Keywords
    noise abatement; speaker recognition; spectral analysis; speech processing; support vector machines; ASR transcript tags; GMM-SVM speaker recognition system; NIST speaker recognition evaluation; energy based frame selection algorithm; front-end frame selection approach; interview channel speaker recognition system; noise reduction; robust voice activity detection; spectral subtraction algorithm; speech activity; Interviews; Microphones; NIST; Speech; Speech recognition; GMM-SVM; distant microphone; interview channel
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing (ISCSLP), 2010 7th International Symposium on
  • Conference_Location
    Tainan
  • Print_ISBN
    978-1-4244-6244-5
  • Type

    conf

  • DOI
    10.1109/ISCSLP.2010.5684886
  • Filename
    5684886
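
The abstract mentions an energy-based frame selection step that marks speech activity at the frame level. The following is a minimal sketch of that general idea, not the authors' implementation: the record gives no frame size, hop, or threshold, so `frame_len`, `hop`, and `margin_db` are assumptions, and the function names are hypothetical. It keeps frames whose log energy lies within a fixed margin of the loudest frame.

```python
import math
from typing import List, Sequence


def frame_log_energies(samples: Sequence[float], frame_len: int, hop: int) -> List[float]:
    """Return the log energy (in dB) of each full frame of the signal."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # A small floor avoids log10(0) on digitally silent frames.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def select_frames(energies: Sequence[float], margin_db: float = 30.0) -> List[bool]:
    """Mark a frame as speech if it is within margin_db of the loudest frame.

    margin_db is an assumed value chosen for illustration only.
    """
    if not energies:
        return []
    threshold = max(energies) - margin_db
    return [e >= threshold for e in energies]


if __name__ == "__main__":
    # Toy signal at an assumed 8 kHz rate: silence, a 440 Hz tone, silence.
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(160)]
    signal = [0.0] * 160 + tone + [0.0] * 160
    mask = select_frames(frame_log_energies(signal, frame_len=160, hop=160))
    print(mask)
```

In a pipeline like the one the abstract describes, such a mask would presumably be computed after noise reduction and then combined with the NIST-provided VAD and ASR transcript tags, but how the authors combine them is not stated in this record.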