Title :
Local Features and a Two-Layer Stacking Architecture for Semantic Concept Detection in Video
Author :
Markatopoulou, Foteini ; Mezaris, Vasileios ; Pittaras, Nikiforos ; Patras, Ioannis
Author_Institution :
Centre for Research & Technology Hellas, Information Technologies Institute, Thermi, Greece
Abstract :
In this paper, we address the problem of extending and using different local descriptors, as well as exploiting concept correlations, toward improved video semantic concept detection. We examine how state-of-the-art binary local descriptors can facilitate concept detection, we propose color extensions of them inspired by previously proposed color extensions of the scale-invariant feature transform (SIFT), and we show that this color extension paradigm is generally applicable to both binary and nonbinary local descriptors. To use these descriptors in conjunction with a state-of-the-art feature encoding, we compact the above color extensions using PCA, and we compare two alternatives for doing this. Concerning the learning stage of concept detection, we perform a comparative study and propose an improved way of employing stacked models, which capture concept correlations, using multilabel classification algorithms in the last layer of the stack. We examine and compare the effectiveness of the above algorithms both in semantic video indexing within a large video collection and in the somewhat different problem of individual video annotation with semantic concepts, on the extensive video data set of the 2013 TRECVID Semantic Indexing Task. Several conclusions are drawn from these experiments on how to improve video semantic concept detection.
Keywords :
feature extraction; image classification; image coding; image colour analysis; indexing; learning (artificial intelligence); object detection; principal component analysis; transforms; video signal processing; 2013 TRECVID Semantic Indexing Task; PCA; binary local descriptors; color extension; concept correlation; feature encoding; learning stage; local features; multilabel classification algorithm; nonbinary local descriptor; scale invariant feature transform; semantic concept detection; semantic video indexing; stacked model; two-layer stacking architecture; video annotation; video collection; Correlation; Detectors; Image color analysis; Semantics; Stacking; Vectors; Content analysis and indexing; binary descriptors; multi-label classification; semantic video annotation; video feature extraction
Journal_Title :
Emerging Topics in Computing, IEEE Transactions on
DOI :
10.1109/TETC.2015.2418714