DocumentCode :
2163484
Title :
Unsupervised extraction of audio-visual objects
Author :
Casanovas, Anna Llagostera ; Vandergheynst, Pierre
Author_Institution :
Signal Process. Lab. (LTS2), Ecole Polytech. Fed. de Lausanne (EPFL), Lausanne, Switzerland
fYear :
2011
fDate :
22-27 May 2011
Firstpage :
2284
Lastpage :
2287
Abstract :
We propose a novel method to automatically detect and extract the video modality of the sound sources that are present in a scene. For this purpose, we first assess the synchrony between the moving objects captured with a video camera and the sounds recorded by a microphone. Next, video regions presenting a high coherence with the soundtrack are automatically labelled as being part of the source. This represents the starting point for an innovative video segmentation approach, whose objective is to extract the complete audio-visual object. The proposed graph-cut segmentation procedure includes an audio-visual term that links together pixels in regions with high audio-video coherence. Our approach is demonstrated on challenging sequences presenting non-stationary sound sources and distracting moving objects.
Keywords :
acoustic radiators; audio-visual systems; feature extraction; graph theory; image segmentation; microphones; motion estimation; audio-video coherence; audio-visual objects; graph-cut segmentation; microphone; moving object detection; sound sources; unsupervised extraction; video camera; video modality; video segmentation; Coherence; Color; Correlation; Face; Mouth; Pixel; Visualization; audio-visual processing; graph cuts
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
ISSN :
1520-6149
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2011.5946938
Filename :
5946938