DocumentCode :
1679223
Title :
3D pooling on local space-time features for human action recognition
Author :
Hadibarhaghtalab, Najme ; Azimifar, Zohreh
Author_Institution :
Sch. of Comput. & Electr. Eng., Shiraz Univ., Shiraz, Iran
fYear :
2013
Firstpage :
266
Lastpage :
269
Abstract :
Successful approaches to human action recognition use local space-time features, either hand-designed or learned. However, these methods need a sound technique for encoding the local features into a global representation of the video. For this purpose, some methods use K-means vector quantization to represent each video as a bag-of-words histogram. Pooling is another technique, used to build a global representation of an image by pooling the local image features over an image neighborhood. In this paper, we extend pooling to video with a method called 3D pooling. 3D pooling represents each video by concatenating the pooled feature vectors obtained from 8 equal regions of the video. We also apply stacked convolutional ISA as the local feature extractor. We evaluated our method on the KTH data set and obtained our best result using max pooling. The method improves on the performance of widely used earlier methods.
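The pooling step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the 8 equal regions are formed, so the code assumes each of the three axes (time, height, width) is split in half, giving 8 octants. The function name `pool_3d`, its parameters, and the choice to leave empty regions as zeros are hypothetical and are not taken from the paper.

```python
import numpy as np


def pool_3d(features, positions, video_shape, mode="max"):
    """Pool local descriptors over 8 video regions and concatenate the results.

    features    : (N, D) array of local feature vectors (hypothetical input).
    positions   : (N, 3) array of (t, y, x) locations, one per feature.
    video_shape : (T, H, W) extent of the video.
    mode        : "max" or "mean" pooling.

    Assumption: the 8 equal regions are the octants obtained by halving
    each axis; the paper may define them differently.
    """
    if mode not in ("max", "mean"):
        raise ValueError("mode must be 'max' or 'mean'")

    features = np.asarray(features, dtype=float)
    positions = np.asarray(positions, dtype=float)
    half = np.asarray(video_shape, dtype=float) / 2.0

    # One bit per axis: 0 if the feature lies in the first half, 1 otherwise.
    bits = (positions >= half).astype(int)
    octant = bits[:, 0] * 4 + bits[:, 1] * 2 + bits[:, 2]

    reduce = np.max if mode == "max" else np.mean
    pooled = np.zeros((8, features.shape[1]))
    for k in range(8):
        members = features[octant == k]
        if len(members) > 0:
            # Regions with no features keep a zero vector (an assumption).
            pooled[k] = reduce(members, axis=0)

    # Global video descriptor of length 8 * D.
    return pooled.reshape(-1)
```

With D-dimensional local features, the resulting video descriptor has 8 × D entries, which could then be passed to a classifier such as the support vector machine mentioned in the keywords.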
Keywords :
feature extraction; image representation; vector quantisation; video signal processing; 3D pooling; K-means vector quantization; bag of word; global representation; hand designed features; human action recognition; image neighborhood; local feature extractor; local image feature; local space-time features; video; Accuracy; Feature extraction; Pipelines; Support vector machine classification; Three-dimensional displays; Vectors; action recognition; independent subspace analysis (ISA); local feature; pooling
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Vision and Image Processing (MVIP), 2013 8th Iranian Conference on
Conference_Location :
Zanjan
ISSN :
2166-6776
Print_ISBN :
978-1-4673-6182-8
Type :
conf
DOI :
10.1109/IranianMVIP.2013.6779992
Filename :
6779992