Learning Multiple Sequence-Based Kernels for Video Concept Detection

Author

Bailer, Werner

Author_Institution

Joanneum Res., DIGITAL - Inst. for Inf. & Commun. Technol., Graz, Austria

fYear

2012

fDate

10-12 Dec. 2012

Firstpage

73

Lastpage

77

Abstract

Kernel-based methods are widely applied to concept and event detection in video. Recently, kernels working on sequences of feature vectors of a video segment have been proposed for this problem, rather than treating feature vectors of individual frames independently. It has been shown that these sequence-based kernels (based, e.g., on the dynamic time warping or edit distance paradigms) outperform methods working on single frames for concepts with inherently dynamic features. Existing work on sequence-based kernels uses either a single type of feature or a fixed combination of the feature vectors of each frame. However, different features (e.g., visual and audio features) may be sampled at different (possibly even irregular) rates, and the optimal alignment between the sequences of features may differ. Multiple kernel learning (MKL) has been applied to similarly structured problems, and we propose MKL for combining different sequence-based kernels on different features for video concept detection. We demonstrate the advantage of the proposed method with experiments on the TRECVID 2011 Semantic Indexing data set.
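The idea described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the `gamma` parameter, and the exponential form of the kernel are assumptions made for illustration, and the weighting step uses kernel-target alignment as a simple stand-in for a real MKL solver. One dynamic time warping (DTW) kernel is computed per feature type, so each feature stream can have its own length and alignment, and the kernels are then merged as a weighted sum.

```python
import numpy as np

def dtw_distance(a, b):
    """DTW distance between two sequences of feature vectors (length x dim)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def dtw_kernel_matrix(seqs, gamma=1.0):
    """Similarity matrix exp(-gamma * DTW). Assumption: this form is chosen
    for illustration; it is not guaranteed to be positive semidefinite."""
    n = len(seqs)
    K = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            K[i, j] = K[j, i] = np.exp(-gamma * dtw_distance(seqs[i], seqs[j]))
    return K

def alignment_weights(kernels, y):
    """Heuristic kernel weights from kernel-target alignment.
    Assumption: a stand-in for MKL, not the optimisation used in the paper."""
    Y = np.outer(y, y)
    scores = np.array([
        max(np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y)), 0.0)
        for K in kernels
    ])
    total = scores.sum()
    if total == 0.0:
        # No kernel aligns with the labels: fall back to uniform weights.
        return np.full(len(kernels), 1.0 / len(kernels))
    return scores / total

def combine_kernels(kernels, weights):
    """Weighted sum of per-feature kernel matrices."""
    return sum(w * K for w, K in zip(weights, kernels))
```

Under these assumptions, the combined matrix could be handed to any classifier that accepts a precomputed kernel, with labels `y` in {-1, +1} marking the absence or presence of a concept in each video segment.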

Keywords

learning (artificial intelligence); sequences; video signal processing; MKL; TRECVID 2011 Semantic Indexing data set; feature vector sequences; multiple sequence-based kernel learning; video concept detection; video event detection; video segment; Feature extraction; Histograms; Kernel; Multimedia communication; Streaming media; Vectors; Visualization; feature combination; fusion; learning

fLanguage

English

Publisher

IEEE

Conference_Titel

Multimedia (ISM), 2012 IEEE International Symposium on

Conference_Location

Irvine, CA

Print_ISBN

978-1-4673-4370-1

Type

conf

DOI

10.1109/ISM.2012.22

Filename

6424634