Regional Information Center for Science and Technology - Multi-Layer Multi-Instance Learning for Video Concept Detection

DocumentCode :

1017765

Title :

Multi-Layer Multi-Instance Learning for Video Concept Detection

Author :

Gu, Zhiwei ; Mei, Tao ; Hua, Xian-Sheng ; Tang, Jinhui ; Wu, Xiuqing

Author_Institution :

Dept. of Electron. Eng. & Inf. Sci., Univ. of Sci. & Technol. of China, Hefei

Volume :

Issue :

fYear :

2008

Firstpage :

1605

Lastpage :

1616

Abstract :

This paper presents a novel learning-based method, called "multi-layer multi-instance (MLMI) learning," for video concept detection. Most existing methods treat video as a flat data sequence and do not investigate the intrinsic hierarchical structure of video content in depth. However, video is essentially a medium with a multi-layer (ML) structure. For example, a video can be represented by a hierarchy that includes, from large to small, shot, frame, and region, where each pair of contiguous layers fits the typical multi-instance (MI) setting. We call such an ML structure, together with the MI relations embedded in it, the MLMI setting. In this paper, we systematically study both the ML structure and the MI relations embedded in video content by formulating video concept detection as an MLMI learning problem. Specifically, we first construct an MLMI kernel to simultaneously model the ML structure and the MI relations. To deal with the ambiguity propagation problem introduced by weak labeling and the ML structure, we then propose a regularization framework that takes hyper-bag prediction error, sublayer prediction error, inter-layer inconsistency measure, and classifier complexity into consideration. We have applied the proposed MLMI learning method to the concept detection task over the TRECVid 2005 development corpus, and report better performance than vector-based and state-of-the-art MI learning methods.
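
The shot, frame, and region hierarchy described in the abstract can be illustrated with a minimal sketch. It assumes the standard multi-instance rule (a bag is positive if at least one of its instances is positive) applied between each pair of adjacent layers. The class and method names are hypothetical and are not taken from the paper, and the sketch does not implement the MLMI kernel or the regularization framework.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Region:
    # Hypothetical region-level label; under weak labeling this is
    # normally unknown and only the top-level label is observed.
    is_positive: bool


@dataclass
class Frame:
    regions: List[Region] = field(default_factory=list)

    def label(self) -> bool:
        # MI assumption: a frame is positive if any region is positive.
        return any(r.is_positive for r in self.regions)


@dataclass
class Shot:
    frames: List[Frame] = field(default_factory=list)

    def label(self) -> bool:
        # Same MI assumption one layer up: a shot is positive
        # if any of its frames is positive.
        return any(f.label() for f in self.frames)


if __name__ == "__main__":
    shot = Shot([
        Frame([Region(False), Region(False)]),
        Frame([Region(False), Region(True)]),
    ])
    print(shot.label())
```

Because only the top-level label is usually available, a single positive shot says nothing about which frame or region is responsible, which is one way to read the ambiguity propagation problem the abstract mentions.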

Keywords :

error analysis; learning (artificial intelligence); video signal processing; TRECVid 2005; ambiguity propagation problem; data sequence; hyperbag prediction error; interlayer inconsistency measure; multilayer multiinstance learning; sublayer prediction error; video concept detection; Airplanes; Computer vision; Information analysis; Kernel; Labeling; Learning systems; Predictive models; Surges; Training data; Video equipment; Multi-layer multi-instance learning; kernel; video concept detection;

fLanguage :

English

Journal_Title :

Multimedia, IEEE Transactions on

Publisher :

IEEE

ISSN :

1520-9210

Type :

jour

DOI :

10.1109/TMM.2008.2007290

Filename :

4694899

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1017765