DocumentCode :
253652
Title :
Multi-view Super Vector for Action Recognition
Author :
Zhuowei Cai ; Limin Wang ; Xiaojiang Peng ; Yu Qiao
Author_Institution :
Shenzhen Key Lab. of Comput. Vision & Pattern Recognition, Shenzhen Inst. of Adv. Technol., Shenzhen, China
fYear :
2014
fDate :
23-28 June 2014
Firstpage :
596
Lastpage :
603
Abstract :
Images and videos are often characterized by multiple types of local descriptors such as SIFT, HOG and HOF, each of which describes certain aspects of object features. Recognition systems benefit from fusing multiple types of these descriptors. Two widely applied fusion pipelines are descriptor concatenation and kernel average. The first is effective when different descriptors are strongly correlated, while the second is probably better when descriptors are relatively independent. In practice, however, different descriptors are neither fully independent nor fully correlated, and previous fusion methods may not be satisfactory. In this paper, we propose a new global representation, Multi-View Super Vector (MVSV), which is composed of relatively independent components derived from a pair of descriptors. Kernel average is then applied to these components to produce the recognition result. To obtain MVSV, we develop a generative mixture model of probabilistic canonical correlation analyzers (M-PCCA), and utilize the hidden factors and gradient vectors of M-PCCA to construct MVSV for video representation. Experiments on video-based action recognition tasks show that MVSV achieves promising results and outperforms FV and VLAD with either the descriptor concatenation or the kernel average fusion strategy.
Keywords :
feature extraction; image motion analysis; image representation; object recognition; statistical analysis; video signal processing; HOF descriptors; HOG descriptors; M-PCCA; MVSV representation; SIFT descriptors; descriptor concatenation; fusion pipelines; generative mixture model of probabilistic canonical correlation analyzers; histogram-of-oriented gradients; kernel average; multiview super vector; object feature; scale invariant feature transforms; video based action recognition tasks; Accuracy; Correlation; Encoding; Kernel; Probabilistic logic; Vectors; Videos; action recognition; canonical correlation analysis; mixture model; multi-view;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on
Conference_Location :
Columbus, OH
Type :
conf
DOI :
10.1109/CVPR.2014.83
Filename :
6909477