• DocumentCode
    1515340
  • Title
    Robust Voice Activity Detection Using Long-Term Signal Variability
  • Author
    Ghosh, Prasanta Kumar ; Tsiartas, Andreas ; Narayanan, Shrikanth
  • Author_Institution
    Dept. of Electr. Eng., Univ. of Southern California, Los Angeles, CA, USA
  • Volume
    19
  • Issue
    3
  • fYear
    2011
  • fDate
    3/1/2011
  • Firstpage
    600
  • Lastpage
    613
  • Abstract
    We propose a novel long-term signal variability (LTSV) measure, which describes the degree of nonstationarity of a signal. We analyze the LTSV measure both analytically and empirically for speech and for various stationary and nonstationary noises. Based on this analysis, we find that the LTSV measure can discriminate noise from noisy speech and, hence, is a potential feature for voice activity detection (VAD). We describe an LTSV-based VAD scheme and evaluate its performance under eleven types of noise and five signal-to-noise ratio (SNR) conditions. Comparison with standard VAD schemes demonstrates that the accuracy of the LTSV-based VAD scheme, averaged over all noises and all SNRs, is ~6% (absolute) better than that of the best among the considered VAD schemes, namely AMR-VAD2. We also find that, at -10 dB SNR, the accuracies of the proposed LTSV-based scheme and the best considered VAD scheme are 88.49% and 79.30%, respectively. This improvement in VAD accuracy indicates the robustness of the LTSV feature for VAD at low SNR for most of the noises considered.
  • Keywords
    acoustic signal detection; speech recognition; long-term signal variability measure; noisy speech signal; nonstationary noises; robust voice activity detection; signal-to-noise ratio conditions; speech; stationary noises; Acoustic noise; Noise measurement; Permission; Robustness; Signal analysis; Signal detection; Signal to noise ratio; Speech analysis; Speech enhancement; Working environment noise
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2010.2052803
  • Filename
    5484460