Title :
Video scene segmentation using video and audio features
Author :
Sundaram, Hari ; Chang, Shih-Fu
Author_Institution :
Dept. of Electr. Eng., Columbia Univ., New York, NY, USA
Abstract :
We present a novel algorithm for video scene segmentation. We model a scene as a semantically consistent chunk of audio-visual data. Central to the segmentation framework is the idea of a finite-memory model. We separately segment the audio and video data into scenes, using data in the memory. The audio segmentation algorithm determines the correlations amongst the envelopes of audio features. The video segmentation algorithm determines the correlations amongst shot key-frames. The scene boundaries in both cases are determined using local correlation minima. Then, we fuse the resulting segments using a nearest neighbor algorithm that is further refined using a time-alignment distribution derived from the ground truth. The algorithm was tested on a difficult data set, the first hour of a commercial film, with good results: it achieves a scene segmentation accuracy of 84%.
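The two steps named in the abstract, boundary detection at local correlation minima and nearest-neighbor fusion of audio and video boundaries, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `local_minima` and `fuse_boundaries` are hypothetical, and the fixed tolerance `max_gap` is an assumed stand-in for the time-alignment distribution that the paper derives from ground truth.

```python
from typing import List, Tuple


def local_minima(corr: List[float]) -> List[int]:
    """Return indices whose value is strictly lower than both neighbours.

    Each index stands for a candidate scene boundary in a correlation
    sequence (assumed to be precomputed elsewhere).
    """
    return [
        i
        for i in range(1, len(corr) - 1)
        if corr[i] < corr[i - 1] and corr[i] < corr[i + 1]
    ]


def fuse_boundaries(
    audio: List[float], video: List[float], max_gap: float
) -> List[Tuple[float, float]]:
    """Pair each video boundary with its nearest audio boundary.

    A pair is kept only if the two times differ by at most max_gap
    seconds. ASSUMPTION: a fixed window replaces the paper's
    ground-truth time-alignment distribution.
    """
    pairs = []
    for v in video:
        if not audio:
            break
        nearest = min(audio, key=lambda a: abs(a - v))
        if abs(nearest - v) <= max_gap:
            pairs.append((v, nearest))
    return pairs


# Illustrative values only, not data from the paper.
corr = [0.9, 0.8, 0.3, 0.7, 0.9, 0.4, 0.8]
candidates = local_minima(corr)
fused = fuse_boundaries([10.0, 61.5, 120.0], [12.0, 60.0, 200.0], max_gap=5.0)
```

In this sketch a video boundary with no audio boundary inside the window is simply dropped; how unmatched boundaries are treated in the actual algorithm is not stated in the abstract.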
Keywords :
audio signal processing; image segmentation; multimedia systems; video signal processing; audio data segmentation; audio-visual data; commercial film; finite-memory model; local correlation minima; nearest neighbor algorithm; shot key-frames; time-alignment distribution; video scene segmentation; Event detection; Explosions; Fuses; Image segmentation; Layout; Music; Navigation; Nearest neighbor searches; Speech; Testing;
Conference_Title :
Multimedia and Expo, 2000. ICME 2000. 2000 IEEE International Conference on
Conference_Location :
New York, NY
Print_ISBN :
0-7803-6536-4
DOI :
10.1109/ICME.2000.871563