Title :
Blind audiovisual separation based on redundant representations
Author :
Casanovas, Anna Llagostera ; Monaci, Gianluca ; Vandergheynst, Pierre ; Gribonval, Rémi
Author_Institution :
Ecole Polytech. Fed. de Lausanne (EPFL), Signal Process. Inst., Lausanne
Date :
March 31 2008-April 4 2008
Abstract :
In this work we present a method that performs complete audiovisual source separation without the need for prior information. The method is based on the assumption that sounds are caused by moving structures. An efficient representation of the audio and video sequences thus makes it possible to build relationships between synchronous structures in both modalities. A robust clustering algorithm groups video structures that exhibit strong correlations with the audio, so that sources are counted and located in the image. Using this information and exploiting audio-video correlation, the activity of each audio source is determined. Next, spectral Gaussian Mixture Models (GMMs) are learnt in time slots where only one source is active, so that the sources can be separated when they are mixed in the audio. Audio source separation performance is rigorously evaluated, and the results show that the proposed algorithm performs efficiently and robustly.
Keywords :
Gaussian processes; audio signal processing; blind source separation; correlation methods; image sequences; signal representation; video signal processing; audio sequences; audio-video correlation; blind audiovisual source separation; robust clustering algorithm; spectral Gaussian mixture models; synchronous structures; video sequences; Image segmentation; Loudspeakers; Matching pursuit algorithms; Microphones; Performance evaluation; Robustness; Signal processing algorithms; Source separation; Speech; Stress; Audiovisual processing; GMM; sparse signal representation
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4244-1483-3
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2008.4517991