Title :
Comparing Audio and Video Segmentations for Music Videos Indexing
Author :
Gillet, Olivier ; Richard, Gaël
Author_Institution :
GET, Telecom Paris
Abstract :
Music videos are good examples of multimedia documents in which the structures of the audio and video streams are highly correlated. This paper presents a system that matches these structures and extracts audio-visual correlation measures. The audio and video streams are independently segmented at two levels: shots (or sections, for audio) and events. Audio segmentation is performed at the event level by detecting onsets, and at the section level by a novelty detection algorithm that identifies instrumentation changes. Video segmentation is performed at the event level by detecting changes in the motion intensity descriptor, and at the shot level by a classical histogram-based shot detection algorithm. Audio-visual correlation measures are computed on the extracted structures. Possible applications include audio/video stream resynchronization, video retrieval from audio content, and classification of music videos by genre.
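As an illustration of what a correlation measure over two independent segmentations could look like, the following is a minimal sketch, not the measure used in the paper (the abstract does not specify it). It assumes each segmentation is reduced to a list of boundary times in seconds, and scores the fraction of video boundaries that fall within a tolerance of some audio boundary. The function names `nearest_distance` and `boundary_agreement`, the `tolerance` parameter, and its default value are hypothetical choices made for this example.

```python
from bisect import bisect_left


def nearest_distance(sorted_times, t):
    """Distance from t to the closest value in sorted_times (inf if empty)."""
    if not sorted_times:
        return float("inf")
    i = bisect_left(sorted_times, t)
    # Only the neighbours just before and just after the insertion point
    # can be the closest values.
    candidates = sorted_times[max(i - 1, 0):i + 1]
    return min(abs(t - c) for c in candidates)


def boundary_agreement(audio_times, video_times, tolerance=0.1):
    """Fraction of video boundaries lying within `tolerance` seconds
    of some audio boundary (hypothetical measure, for illustration only)."""
    if not video_times:
        return 0.0
    audio_sorted = sorted(audio_times)
    hits = sum(
        1 for t in video_times
        if nearest_distance(audio_sorted, t) <= tolerance
    )
    return hits / len(video_times)


if __name__ == "__main__":
    # Made-up boundary times, e.g. note onsets and motion-intensity changes.
    audio_onsets = [0.5, 1.0, 1.5, 2.0]
    video_events = [0.52, 1.3, 2.05]
    print(boundary_agreement(audio_onsets, video_events))
```

The same function could be applied at either level of the hierarchy described above (events, or shots against sections) by passing the corresponding boundary lists and a tolerance suited to that time scale.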
Keywords :
audio signal processing; audio-visual systems; image classification; image segmentation; indexing; synchronisation; video signal processing; video streaming; audio segmentations; audio-video stream resynchronization; audio-visual correlation measures; detection algorithm; histogram-based shot detection algorithm; motion intensity descriptor; multimedia documents; music videos classification; music videos indexing; video retrieval; video segmentations; Content based retrieval; Detection algorithms; Event detection; Gunshot detection systems; Indexing; Instruments; Multiple signal classification; Music information retrieval; Streaming media; TV broadcasting;
Conference_Title :
Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
Conference_Location :
Toulouse
Print_ISBN :
1-4244-0469-X
DOI :
10.1109/ICASSP.2006.1661202