• DocumentCode
    730702
  • Title
    Improvements to the IBM speech activity detection system for the DARPA RATS program
  • Author
    Thomas, Samuel ; Saon, George ; Van Segbroeck, Maarten ; Narayanan, Shrikanth S.
  • Author_Institution
    IBM T.J. Watson Res. Center, Yorktown Heights, NY, USA
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    4500
  • Lastpage
    4504
  • Abstract
    In this paper we describe improvements to the IBM speech activity detection (SAD) system for the third phase of the DARPA RATS program. The progress during this final phase comes from jointly training convolutional and regular deep neural networks with rich time-frequency representations of speech. With these additions, the phase 3 system reduces the equal error rate (EER) significantly on both of the program's development sets (relative improvements of 20% on dev1 and 7% on dev2) compared to an earlier phase 2 system. For the final program evaluation, the newly developed system also performs well past the program target of 3% Pmiss at 1% Pfa with a performance of 1.2% Pmiss at 1% Pfa and 0.3% Pfa at 3% Pmiss.
  • Keywords
    convolution; error statistics; neural nets; signal representation; speech processing; time-frequency analysis; DARPA RATS program; EER; IBM speech activity detection system; SAD; equal error rate; neural networks; time-frequency representations; Acoustics; Feature extraction; Hidden Markov models; Neural networks; Rats; Speech; Training; Speech activity detection; acoustic features; deep neural networks; robust speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
  • Conference_Location
    South Brisbane, QLD
  • Type
    conf
  • DOI
    10.1109/ICASSP.2015.7178822
  • Filename
    7178822