• DocumentCode
    730703
  • Title
    Voice activity detection using subband noncircularity
  • Author
    Wisdom, Scott ; Okopal, Greg ; Atlas, Les ; Pitton, James
  • Author_Institution
    Dept. of Electr. Eng., Univ. of Washington, Seattle, WA, USA
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    4505
  • Lastpage
    4509
  • Abstract
    Many voice activity detection (VAD) systems use the magnitude of complex-valued spectral representations. However, using only the magnitude often does not fully characterize the statistical behavior of the complex values. We present two novel methods for performing VAD on single- and dual-channel audio that do completely account for the second-order statistical behavior of complex data. Our methods exploit the second-order noncircularity (also known as impropriety) of complex subbands of speech and noise. Since speech tends to be more improper than noise, higher impropriety suggests speech activity. Our single-channel method is blind in the sense that it is unsupervised and, unlike many VAD systems, does not rely on non-speech periods for noise parameter estimation. Our methods achieve improved performance over other state-of-the-art magnitude-based VADs on the QUT-NOISE-TIMIT corpus, which indicates that impropriety is a compelling new feature for voice activity detection.
  • Keywords
    audio signal processing; signal detection; speech processing; statistical analysis; QUT-NOISE-TIMIT corpus; VAD systems; complex-valued spectral representation magnitude; dual-channel audio; noise parameter estimation; second-order noncircularity; second-order statistical behavior; single-channel audio; speech activity; subband noncircularity; voice activity detection systems; Frequency estimation; Noise measurement; Signal to noise ratio; Speech; Speech processing; Time-frequency analysis; Voice activity detection; complex-valued data; second-order statistics; spectral impropriety;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
  • Conference_Location
    South Brisbane, QLD
  • Type
    conf
  • DOI
    10.1109/ICASSP.2015.7178823
  • Filename
    7178823
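
The impropriety feature named in the abstract can be illustrated with a minimal sketch. It assumes the standard scalar circularity coefficient |E[z^2]| / E[|z|^2] as the measure of second-order noncircularity; this record does not give the paper's actual estimator or detection rule, so the function names, the synthetic data, and the sample sizes below are all hypothetical and are not the authors' method.

```python
import random


def circularity_coefficient(samples):
    """Estimate |E[z^2]| / E[|z|^2] from complex samples.

    Values near 0 indicate proper (circular) data; values near 1
    indicate strongly improper (noncircular) data.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    mean = sum(samples) / n
    centered = [z - mean for z in samples]
    # Complementary (pseudo) covariance: E[z^2], no conjugate.
    pseudo_cov = sum(z * z for z in centered) / n
    # Ordinary covariance: E[|z|^2].
    cov = sum(abs(z) ** 2 for z in centered) / n
    if cov == 0.0:
        return 0.0
    return abs(pseudo_cov) / cov


def make_samples(n, imag_scale, seed):
    """Hypothetical stand-in for one complex subband: Gaussian real and
    imaginary parts, with the imaginary part scaled by imag_scale."""
    rng = random.Random(seed)
    return [complex(rng.gauss(0, 1), imag_scale * rng.gauss(0, 1))
            for _ in range(n)]


# Equal-variance, uncorrelated real/imaginary parts: proper data.
proper = make_samples(5000, 1.0, seed=0)
# Unequal variances between real and imaginary parts: improper data.
improper = make_samples(5000, 0.2, seed=1)

rho_proper = circularity_coefficient(proper)
rho_improper = circularity_coefficient(improper)
print(f"proper:   {rho_proper:.3f}")
print(f"improper: {rho_improper:.3f}")
```

Under the abstract's premise that speech tends to be more improper than noise, a detector could compare such a per-subband coefficient against a threshold; how the paper combines subbands or channels is not stated in this record.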