Title :
Unsupervised multimodal VAD using sequential hierarchy
Author :
Ahmad, Rabiah ; Raza, Syed Paymaan ; Malik, Haroon
Author_Institution :
Inf. Syst., Security & Forensics (ISSF) Lab., Univ. of Michigan, Dearborn, MI, USA
Abstract :
In speech processing systems, the performance of the Voice Activity Detector (VAD) is a bottleneck for the whole system. Traditional VADs rely solely on acoustic features. An additional modality, in the form of visual information, can be used to make VADs more robust. In this paper, we propose a multimodal VAD based on decision fusion between the two modalities. Visual VAD (VVAD) decision vectors are interpolated so that logical operators can be applied to both modalities. To avoid this interpolation, we suggest a sequential arrangement of the two subsystems to achieve a multimodal VAD. The proposed method considerably reduces false alarm rates compared with a standalone audio VAD (AVAD).
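The interpolation-based decision fusion mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the frame rates (100 audio frames/s, 25 video frames/s), the zero-order-hold interpolation, the choice of logical AND as the fusion operator, and the helper names `upsample_decisions` and `fuse_and`.

```python
def upsample_decisions(decisions, src_rate, dst_rate, n_out):
    """Zero-order-hold interpolation of binary decisions to a new frame rate.

    Hypothetical helper: each output frame copies the decision of the
    source frame that covers the same instant in time.
    """
    out = []
    for i in range(n_out):
        # Index of the source frame covering output frame i,
        # clamped so a short source vector cannot be overrun.
        j = min(i * src_rate // dst_rate, len(decisions) - 1)
        out.append(decisions[j])
    return out


def fuse_and(avad, vvad):
    """Frame-wise logical AND of two equal-length binary decision vectors."""
    if len(avad) != len(vvad):
        raise ValueError("decision vectors must have equal length")
    return [a & v for a, v in zip(avad, vvad)]


# Assumed toy data: 12 audio frames and 3 video frames over the same interval.
avad = [0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1]   # audio-only decisions
vvad = [0, 1, 0]                               # visual decisions

# Bring the visual decisions to the audio frame rate, then fuse.
vvad_up = upsample_decisions(vvad, src_rate=25, dst_rate=100, n_out=len(avad))
fused = fuse_and(avad, vvad_up)

print(vvad_up)  # [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
print(fused)    # [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
```

In this toy case the AND operator discards audio frames 1 to 3 and 10 to 11, which the audio detector flagged as speech but the visual detector did not. That shows how a second modality can suppress false alarms, at the cost of the interpolation step that the sequential arrangement is meant to avoid.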
Keywords :
sensor fusion; signal detection; speech processing; vectors; AVAD; VVAD; audio VAD; decision fusion; false alarm rates; logical operators; speech processing systems; unsupervised multimodal VAD; visual VAD decision vectors; voice activity detector; Data mining; Feature extraction; Noise; Robustness; Speech; Vectors; Visualization;
Conference_Title :
Computational Intelligence and Data Mining (CIDM), 2013 IEEE Symposium on
Conference_Location :
Singapore
DOI :
10.1109/CIDM.2013.6597233