DocumentCode :
2998440
Title :
Visual Voice Activity Detection Using Frontal versus Profile Views
Author :
Navarathna, Rajitha ; Dean, David ; Sridharan, Sridha ; Fookes, Clinton ; Lucey, Patrick
Author_Institution :
Speech, Audio, Image & Video Technol. Lab., Queensland Univ. of Technol., Brisbane, QLD, Australia
fYear :
2011
fDate :
6-8 Dec. 2011
Firstpage :
134
Lastpage :
139
Abstract :
Visual activity detection of lip movements can be used to overcome the poor performance of voice activity detection based solely on the audio domain, particularly in noisy acoustic conditions. However, most research on visual voice activity detection (VVAD) has neglected variability in the visual domain, such as viewpoint variation. In this paper we investigate the effectiveness of visual information from the speaker's frontal and profile views (i.e., left and right side views) for the task of VVAD. As far as we are aware, our work constitutes the first real attempt to study this problem. We describe our visual front-end approach and the Gaussian mixture model (GMM) based VVAD framework, and report experimental results using the freely available CUAVE database. The experimental results show that VVAD is indeed possible from profile views, and we give a quantitative comparison of VVAD based on frontal and profile views. The results presented are useful in the development of multi-modal Human Machine Interaction (HMI) using a single camera, where the speaker's face may not always be frontal.
Keywords :
computer vision; speech recognition; CUAVE database; Gaussian mixture model; audio domain; frontal views; lip movements; multimodal human machine interaction; noisy acoustic condition; profile views; speaker face; visual domain; visual front end; visual voice activity detection; Databases; Face; Feature extraction; Mouth; Speech; Speech recognition; Visualization; Frontal-view; GMM based VAD; Profile-view; Visual Voice Activity Detection;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Digital Image Computing Techniques and Applications (DICTA), 2011 International Conference on
Conference_Location :
Noosa, QLD
Print_ISBN :
978-1-4577-2006-2
Type :
conf
DOI :
10.1109/DICTA.2011.29
Filename :
6128671