Title :
Comparison between DFT- and DWT-based speech/non-speech detection for adverse environments
Author :
Pham, Thuy V. ; Kubin, Gernot
Author_Institution :
Electron. & Telecommun. Eng. Dept., Danang Univ. of Technol., Danang, Vietnam
Abstract :
The goal of this paper is to evaluate wavelet-based and frequency-based voice activity detection (VAD) algorithms under harsh conditions. A new frequency-based speech classifier has been developed based on a single subband distance feature in combination with an adaptive percentile filter. Experimental results in clean, noisy and reverberant environments are provided. Results show that: (i) the group of algorithms exploiting the subband power distance feature mostly outperforms the state-of-the-art VADs standardized in G.729B and ETSI AFE ES 202 050 in terms of classification measures; (ii) the robustness of the model-based VAD methods still holds in a completely mismatched reverberant environment.
Keywords :
discrete Fourier transforms; discrete wavelet transforms; filtering theory; signal classification; speech recognition; DFT-based nonspeech detection; DFT-based speech detection; DWT-based nonspeech detection; DWT-based speech detection; ETSI AFE ES 202 050; VAD algorithms; adaptive percentile filter; classification measures; completely mismatched reverberant environment; frequency-based speech classifier; frequency-based voice activity detection algorithms; harsh conditions; model-based VAD methods; single subband distance feature; state-of-the-art VAD; subband power distance feature; wavelet-based voice activity detection algorithms; Discrete wavelet transforms; Noise; Noise measurement; Power capacitors; Robustness; Speech; Training; adaptive percentile filter; discrete Fourier transform; discrete wavelet transform; neural network; subband decomposition; voice activity detection;
Conference_Title :
Advanced Technologies for Communications (ATC), 2011 International Conference on
Conference_Location :
Da Nang
Print_ISBN :
978-1-4577-1206-7
DOI :
10.1109/ATC.2011.6027490