DocumentCode :
1686141
Title :
Voice activity detection based on frequency modulation of harmonics
Author :
Chung-Chien Hsu ; Tse-En Lin ; Jian-Hueng Chen ; Tai-Shih Chi
Author_Institution :
Dept. of Electr. & Comput. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan
fYear :
2013
Firstpage :
6679
Lastpage :
6683
Abstract :
In this paper, we propose a voice activity detection (VAD) algorithm based on spectro-temporal modulation structures of input sounds. A multi-resolution spectro-temporal analysis framework is used to inspect prominent speech structures. The proposed VAD distinguishes speech from non-speech by comparing the energy of the frequency modulation of harmonics with an adaptive threshold. In non-stationary noise, it significantly outperforms three standard VADs (ITU-T G.729B, ETSI AMR1, and ETSI AMR2) in terms of receiver operating characteristic (ROC) curves and the recognition rates of a practical distributed speech recognition (DSR) system.
Keywords :
frequency modulation; sensitivity analysis; speech recognition; DSR system; ETSI AMR1; ETSI AMR2; ITU-T G.729B; ROC curves; VAD algorithm; adaptive threshold; distributed speech recognition system; harmonics frequency modulation; input sounds; multiresolution spectro-temporal analysis framework; nonstationary noises; receiver operating characteristic curves; recognition rates; spectro-temporal modulation structures; speech structures; voice activity detection algorithm; Frequency modulation; Harmonic analysis; Noise; Spectrogram; Speech; Speech recognition; frequency modulation; spectro-temporal analysis; voice activity detection;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2013.6638954
Filename :
6638954