Title :
A scalable framework for joint clustering and synchronizing multi-camera videos
Author :
Bagri, Ashish ; Thudor, Franck ; Ozerov, Alexey ; Hellier, P.
Author_Institution :
Technicolor, Rennes, France
Abstract :
This paper describes a method to cluster and synchronize large-scale audio-video sequences recorded by multiple users during an event. The proposed method jointly clusters the audio content and synchronizes the sequences within each cluster to create a multi-view presentation of the event. The method is broadly based on cross-correlation of local audio features. Three main contributions are presented to obtain a scalable and accurate framework. First, a salient representation of the features is used to reduce computational complexity while maintaining high performance. Second, an intermediate clustering step is introduced to limit the number of comparisons required. Third, a voting approach is proposed to avoid tuning thresholds for cross-correlation. The framework was tested on 164 YouTube concert videos, and the results demonstrate the efficiency of the method, with 98.8% of the sequences clustered correctly.
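The cross-correlation idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each recording has already been reduced to a one-dimensional per-frame audio feature sequence, and it leaves out the salient feature representation, the intermediate clustering step, and the voting scheme. The function name `estimate_offset` and the score normalization are hypothetical choices made for this illustration.

```python
import numpy as np


def estimate_offset(a, b):
    """Estimate the lag (in frames) that best aligns b with a.

    a, b: 1-D arrays of per-frame audio features (an assumption of this
    sketch; the record does not specify the feature layout).
    Returns (lag, score). A positive lag k means b[n] matches a[n + k],
    i.e. recording b started k frames after recording a.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    # Zero-mean, unit-variance normalization so the peak height is
    # comparable across pairs of recordings.
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)

    # Full cross-correlation: output index i corresponds to lag
    # i - (len(b) - 1), covering every possible overlap.
    corr = np.correlate(a, b, mode="full")
    lags = np.arange(-(len(b) - 1), len(a))

    best = int(np.argmax(corr))
    # Hypothetical score: peak value divided by the shorter length.
    score = float(corr[best] / min(len(a), len(b)))
    return int(lags[best]), score
```

In a sketch like this, a pair of recordings whose peak score is high would be treated as overlapping in time, and the lag gives their relative offset. Deciding what counts as "high" requires a threshold, which is the tuning problem the voting approach in the abstract is meant to avoid.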
Keywords :
audio signal processing; computational complexity; feature extraction; image representation; image sequences; pattern clustering; synchronisation; video signal processing; YouTube concert videos; audio content clustering; intermediate clustering step; joint multicamera video clustering; large scale audio-video sequences; local audio feature cross-correlation; multicamera video synchronization; multiview presentation; tuning thresholds; voting approach; Complexity theory; Databases; Joints; Mel frequency cepstral coefficient; Synchronization; Tuning; Videos; clustering methods; cross-correlation; scalability
Conference_Title :
Proceedings of the 21st European Signal Processing Conference (EUSIPCO), 2013
Conference_Location :
Marrakech