DocumentCode :
1196759
Title :
On the Correlation of Automatic Audio and Visual Segmentations of Music Videos
Author :
Gillet, Olivier ; Essid, Slim ; Richard, Gael
Author_Institution :
GET-Telecom, LTCI-CNRS, Paris
Volume :
17
Issue :
3
fYear :
2007
fDate :
3/1/2007 12:00:00 AM
Firstpage :
347
Lastpage :
355
Abstract :
The study of the associations between audio and video content has numerous important applications in the fields of information retrieval and multimedia content authoring. In this work, we focus on music videos, which exhibit a broad range of structural and semantic relationships between the music and the video content. To identify such relationships, a two-level automatic structuring of the music and the video is achieved separately. Note onsets are detected from the music signal, along with section changes. The latter is achieved by a novel algorithm which makes use of feature selection and statistical novelty detection approaches based on kernel methods. The video stream is independently segmented to detect changes in motion activity, as well as shot boundaries. Based on this two-level segmentation of both streams, four audio-visual correlation measures are computed. The usefulness of these correlation measures is illustrated by a query-by-video experiment on a database of 100 music videos, which also exhibits interesting genre dependencies.
Keywords :
audio signal processing; content-based retrieval; feature extraction; image segmentation; statistical analysis; video signal processing; video streaming; audio-visual correlation; automatic audio segmentations; feature selection; information retrieval; kernel methods; multimedia content authoring; music signal; music videos; semantic relationships; statistical novelty detection; video stream; visual segmentations; Content based retrieval; Digital multimedia broadcasting; Gunshot detection systems; Indexing; Multimedia communication; Multiple signal classification; Music information retrieval; Streaming media; TV broadcasting; Videos; Audio segmentation; cross-modal queries; multimedia indexing; multimodal processing; novelty detection
fLanguage :
English
Journal_Title :
Circuits and Systems for Video Technology, IEEE Transactions on
Publisher :
IEEE
ISSN :
1051-8215
Type :
jour
DOI :
10.1109/TCSVT.2007.890831
Filename :
4118238