DocumentCode :
2115398
Title :
SAVE: A framework for semantic annotation of visual events
Author :
Lee, Mun Wai ; Hakeem, Asaad ; Haering, Niels ; Zhu, Song-Chun
Author_Institution :
ObjectVideo, Reston, VA
fYear :
2008
fDate :
23-28 June 2008
Firstpage :
1
Lastpage :
8
Abstract :
In this paper, we propose a framework that performs automatic semantic annotation of visual events (SAVE). This is an enabling technology for content-based video annotation, query and retrieval, with applications in Internet video search and video data mining. The method involves identifying objects in the scene, describing their inter-relations, detecting events of interest, and representing them semantically in a human-readable and queryable format. The SAVE framework is composed of three main components. The first component is an image parsing engine that performs scene content extraction using bottom-up image analysis and a stochastic attribute image grammar, where we define a visual vocabulary from pixels, primitives, parts, objects and scenes, and specify their spatio-temporal or compositional relations; a combined bottom-up and top-down strategy is used for inference. The second component is an event inference engine, where the video event markup language (VEML) is adopted for semantic representation, and a grammar-based approach is used for event analysis and detection. The third component is a text generation engine that generates text reports using head-driven phrase structure grammar (HPSG). The main contribution of this paper is a framework for an end-to-end system that infers visual events and annotates a large collection of videos. Experiments with maritime and urban scenes indicate the feasibility of the proposed approach.
Keywords :
Internet; content-based retrieval; data mining; video retrieval; Internet video search; SAVE; automatic semantic annotation; bottom-up image analysis; content-based video annotation; event inference engine; grammar-based approach; head-driven phrase structure grammar; image parsing engine; scene content extraction; stochastic attribute image grammar; video data mining; video event markup language; video query; video retrieval; visual events; Content based retrieval; Data mining; Event detection; Humans; Image analysis; Information retrieval; Internet; Layout; Object detection; Search engines
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Vision and Pattern Recognition Workshops, 2008. CVPRW '08. IEEE Computer Society Conference on
Conference_Location :
Anchorage, AK
ISSN :
2160-7508
Print_ISBN :
978-1-4244-2339-2
Electronic_ISBN :
2160-7508
Type :
conf
DOI :
10.1109/CVPRW.2008.4562954
Filename :
4562954