DocumentCode :
111779
Title :
Enhancing Video Event Recognition Using Automatically Constructed Semantic-Visual Knowledge Base
Author :
Xishan Zhang ; Yang Yang ; Yongdong Zhang ; Huanbo Luan ; Jintao Li ; Hanwang Zhang ; Tat-Seng Chua
Author_Institution :
Key Lab. of Intell. Inf. Process., Inst. of Comput. Technol., Beijing, China
Volume :
17
Issue :
9
fYear :
2015
fDate :
Sept. 2015
Firstpage :
1562
Lastpage :
1575
Abstract :
The task of recognizing events from video has attracted a lot of attention in recent years. However, due to the complex nature of user-defined events, the use of purely audio-visual content analysis without domain knowledge has been found to be grossly inadequate. In this paper, we propose to construct a semantic-visual knowledge base to encode the rich event-centric concepts and their relationships from the well-established lexical databases, including FrameNet, as well as the concept-specific visual knowledge from ImageNet. Based on this semantic-visual knowledge base, we design an effective system for video event recognition. Specifically, in order to narrow the semantic gap between the high-level complex events and low-level visual representations, we utilize the event-centric semantic concepts encoded in the knowledge base as the intermediate-level event representation, which offers both human-perceivable and machine-interpretable semantic clues for event recognition. In addition, in order to leverage the abundant ImageNet images, we propose a robust transfer learning model to learn the noise-resistant concept classifiers for videos. Extensive experiments on various real-world video datasets demonstrate the superiority of our proposed system as compared to the state-of-the-art approaches.
Keywords :
image classification; knowledge based systems; learning (artificial intelligence); video signal processing; FrameNet; ImageNet images; audio-visual content analysis; automatically constructed semantic-visual knowledge base; concept-specific visual knowledge; event-centric semantic concept encoding; high-level complex events; human-perceivable semantic clues; intermediate-level event representation; lexical database; low-level visual representation; machine-interpretable semantic clues; multiple kernel learning algorithm; noise-resistant concept classifier; robust transfer learning model; semantic gap; user-defined events; video event recognition; Feature extraction; Knowledge based systems; Multimedia communication; Semantics; Streaming media; Vehicles; Visualization; Concept detection; event recognition; knowledge base
fLanguage :
English
Journal_Title :
Multimedia, IEEE Transactions on
Publisher :
IEEE
ISSN :
1520-9210
Type :
jour
DOI :
10.1109/TMM.2015.2449660
Filename :
7132742