Title :
Empirical Mode Decomposition VAD based on multiple sensor LRT
Author :
Petsatodis, Theodoros ; Talantzis, Fotios ; Boukis, Christos
Author_Institution :
Athens Inf. Technol., Athens, Greece
Abstract :
Voice Activity Detection (VAD) remains a challenging task because its performance degrades under adverse noise and reverberation conditions. The problem becomes even more difficult when the microphones used to detect speech are located far from the speaker. In this paper, an unsupervised VAD scheme is presented, based on the Empirical Mode Decomposition (EMD) analysis framework and a multiple-input likelihood ratio test (LRT). The highly efficient EMD method relies on the local characteristic time scales of the data to analyse and decompose non-stationary signals into a set of so-called intrinsic mode functions (IMFs). These functions are fed into the multiple-input LRT scheme to decide on speech presence or absence. To minimize misdetections and enhance the performance of the hypothesis test, a computationally efficient forgetting scheme and an adaptive threshold are also employed. Simulations conducted in several artificial environments indicate that the proposed scheme can be expected to improve significantly on the performance of similar VAD systems.
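As an illustration of the decision stage only, the following is a minimal sketch of a multiple-input LRT with a forgetting factor and an adaptive threshold. It is not the authors' implementation: the abstract does not give the statistical model or the update rules, so the sketch assumes a generic zero-mean Gaussian model per input, a maximum-likelihood SNR estimate, exponential smoothing as the forgetting scheme, and a threshold that tracks the smoothed statistic during non-speech frames. The function names and the parameters `alpha`, `beta` and `margin` are hypothetical, and the per-frame component energies stand in for IMF energies (the EMD step itself is not shown).

```python
import math


def frame_log_lr(energies, noise_vars):
    """Average log-likelihood ratio over K input components for one frame.

    Assumption (not from the paper): each component is zero-mean Gaussian,
    with variance noise_vars[k] under H0 (noise only) and
    noise_vars[k] * (1 + xi_k) under H1 (speech present).
    energies[k] is the mean squared value of component k in the frame.
    The result equals the Gaussian log-likelihood ratio up to a constant
    positive scale factor.
    """
    total = 0.0
    for energy, noise_var in zip(energies, noise_vars):
        gamma = energy / noise_var        # a posteriori SNR
        xi = max(gamma - 1.0, 1e-6)       # ML estimate of the a priori SNR
        total += gamma * xi / (1.0 + xi) - math.log(1.0 + xi)
    return total / len(energies)


def detect(frames_energy, noise_vars, alpha=0.7, beta=0.95, margin=1.0):
    """Return one boolean speech/non-speech decision per frame.

    alpha  : forgetting factor applied to the test statistic (hypothetical)
    beta   : forgetting factor of the non-speech floor estimate (hypothetical)
    margin : offset added to the floor to form the threshold (hypothetical)
    """
    smoothed = 0.0   # recursively smoothed test statistic
    floor = 0.0      # running level of the statistic during non-speech
    decisions = []
    for energies in frames_energy:
        llr = frame_log_lr(energies, noise_vars)
        smoothed = alpha * smoothed + (1.0 - alpha) * llr
        threshold = floor + margin
        is_speech = smoothed > threshold
        if not is_speech:
            # Adapt the threshold only while no speech is detected.
            floor = beta * floor + (1.0 - beta) * smoothed
        decisions.append(is_speech)
    return decisions
```

Calling `detect(frames, noise_vars)` with `frames` as a list of per-frame energy lists (one energy per component) and `noise_vars` as the assumed noise variance of each component returns one decision per frame. The smoothing delays the return to non-speech after a speech burst, which acts as a simple hangover.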
Keywords :
microphones; speech processing; statistical analysis; unsupervised learning; VAD; adverse noise; empirical mode decomposition analysis framework; forgetting scheme; intrinsic mode functions; multiple input likelihood ratio test; multiple sensor LRT; reverberation conditions; speech absence; speech presence; unsupervised VAD scheme; voice activity detection
Conference_Title :
Signal Processing (CIWSP 2013), 2013 Constantinides International Workshop on
Conference_Location :
London
Electronic_ISBN :
978-1-84919-733-5
DOI :
10.1049/ic.2013.0009