  • DocumentCode
    2707977
  • Title
    A novel instantaneous frequency-based voice activity detection for strong noisy speech
  • Author
    Shi, Wei; Zou, Yuexian
  • Author_Institution
    Shenzhen Grad. Sch., Adv. Digital Signal Process. Lab., Peking Univ., Shenzhen, China
  • fYear
    2012
  • fDate
    6-8 June 2012
  • Firstpage
    956
  • Lastpage
    959
  • Abstract
    Developing robust voice activity detection (VAD) for speech in strong noise is a challenging task. In this paper, we propose a novel VAD method under the Hilbert-Huang transform (HHT) framework. It exploits the ability of empirical mode decomposition (EMD) to automatically extract signal-frequency-related intrinsic mode functions (IMFs), which provides a more flexible way to select the IMFs with strong speech components. With the Hilbert transform, the instantaneous frequency (IF) can be computed. Making use of the speech characteristics of each IMF, a weighted instantaneous frequency average (WIFA) measurement is proposed and the corresponding WIFA-VAD algorithm is developed, in which the VAD threshold is estimated automatically from the first noise frames. Experiments show that the proposed WIFA-VAD achieves results comparable to those of conventional VAD algorithms at high SNR. For low-SNR conditions (e.g., -5 dB and below), it achieves a lower false alarm ratio (FAR) and missing error ratio (MER) than the conventional VAD algorithms.
  • Keywords
    Hilbert transforms; signal processing; speech processing; EMD; FAR; HHT framework; Hilbert-Huang transform; IMF; MER; SNR; VAD threshold; WIFA measurement; empirical mode decomposition; false alarm ratio; instantaneous frequency-based voice activity detection; missing error ratio; robust voice activity detection; signal-frequency related intrinsic mode function; strong noisy speech; weighted instantaneous frequency average; Noise measurement; Robustness; Signal processing algorithms; Signal to noise ratio; Speech; Transforms; Voice activity detection; instantaneous frequency
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information and Automation (ICIA), 2012 International Conference on
  • Conference_Location
    Shenyang
  • Print_ISBN
    978-1-4673-2238-6
  • Electronic_ISBN
    978-1-4673-2236-2
  • Type
    conf
  • DOI
    10.1109/ICInfA.2012.6246954
  • Filename
    6246954
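
The processing chain outlined in the abstract (IF from the Hilbert transform, a weighted IF average over IMFs, a threshold taken from the first noise frames) can be illustrated with a minimal sketch. This is not the authors' WIFA-VAD: the record does not give the WIFA formula, the IMF weights, or the threshold rule, so the weighted mean in `weighted_if_average`, the mean-plus-`k`-standard-deviations band in `detect`, and all function and parameter names are assumptions made for illustration. EMD itself is not implemented; the IMFs of a frame are assumed to be supplied by some EMD routine.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real sequence, using a naive O(N^2) DFT."""
    n = len(x)
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(1, n):  # the DC bin (k = 0) is left unchanged
        if n % 2 == 0 and k == n // 2:
            continue  # keep the Nyquist bin unchanged
        if k < (n + 1) // 2:
            spectrum[k] *= 2  # double the positive frequencies
        else:
            spectrum[k] = 0  # discard the negative frequencies
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def instantaneous_frequency(x, fs):
    """Instantaneous frequency in Hz between neighbouring samples of x."""
    z = analytic_signal(x)
    freqs = []
    for a, b in zip(z, z[1:]):
        # Phase advance from one sample to the next, wrapped to [-pi, pi].
        dphi = cmath.phase(b * a.conjugate())
        freqs.append(dphi * fs / (2 * math.pi))
    return freqs


def weighted_if_average(imfs, weights, fs):
    """Assumed measure: weighted mean of each IMF's average IF in one frame."""
    acc = 0.0
    for imf, w in zip(imfs, weights):
        inst = instantaneous_frequency(imf, fs)
        acc += w * sum(inst) / len(inst)
    return acc / sum(weights)


def detect(measures, n_noise, k=3.0):
    """Assumed rule: flag frames whose measure leaves the noise baseline band.

    The baseline is the mean and standard deviation of the first n_noise
    frames, which are taken to contain noise only.
    """
    head = measures[:n_noise]
    mean = sum(head) / len(head)
    std = math.sqrt(sum((m - mean) ** 2 for m in head) / len(head))
    return [abs(m - mean) > k * std for m in measures]
```

In use, each frame would be decomposed into IMFs, `weighted_if_average` evaluated once per frame, and the resulting list passed to `detect` together with the number of leading frames assumed to be noise only. The naive DFT keeps the sketch dependency-free; an FFT-based Hilbert transform would be the practical choice for real frame lengths.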