DocumentCode :
591774
Title :
Robust voice activity detection using empirical mode decomposition and modulation spectrum analysis
Author :
Kanai, Yasukazu ; Unoki, Masashi
Author_Institution :
Sch. of Inf. Sci., Japan Adv. Inst. of Sci. & Technol., Nomi, Japan
fYear :
2012
fDate :
5-8 Dec. 2012
Firstpage :
400
Lastpage :
404
Abstract :
Voice activity detection (VAD) is used to detect speech/non-speech periods in observed signals. However, current VAD techniques have a serious problem: the accuracy of speech-period detection drops drastically for noisy speech and/or for mixtures of speech and non-speech such as those in music and environmental sounds. VAD therefore needs to be robust so that speech periods can be detected accurately in these situations. This paper proposes an approach to robust VAD using empirical mode decomposition (EMD) and modulation spectrum analysis (MSA) to resolve these problems. The approach first reduces background noise using EMD, without estimating the SNR (noise conditions), and then determines speech/non-speech periods using MSA. Three experiments on VAD in real environments were conducted to evaluate the proposed method by comparing it with typical methods (Otsu's and G.729B). The results demonstrated that the proposed method could detect speech periods more accurately than the typical methods.
Keywords :
signal detection; speech processing; EMD; MSA; VAD technique; empirical mode decomposition; modulation spectrum analysis; noisy speech; nonspeech period detection; robust voice activity detection; Databases; Modulation; Noise measurement; Robustness; Signal to noise ratio; Speech; empirical mode decomposition; modulation spectrum analysis; voice activity detection;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on
Conference_Location :
Kowloon
Print_ISBN :
978-1-4673-2506-6
Electronic_ISBN :
978-1-4673-2505-9
Type :
conf
DOI :
10.1109/ISCSLP.2012.6423519
Filename :
6423519