DocumentCode :
3403653
Title :
Learning a hierarchy of discriminative space-time neighborhood features for human action recognition
Author :
Kovashka, Adriana ; Grauman, Kristen
Author_Institution :
Dept. of Comput. Sci., Univ. of Texas at Austin, Austin, TX, USA
fYear :
2010
fDate :
13-18 June 2010
Firstpage :
2046
Lastpage :
2053
Abstract :
Recent work shows how to use local spatio-temporal features to learn models of realistic human actions from video. However, existing methods typically rely on a predefined spatial binning of the local descriptors to impose spatial information beyond a pure “bag-of-words” model, and thus may fail to capture the most informative space-time relationships. We propose to learn the shapes of space-time feature neighborhoods that are most discriminative for a given action category. Given a set of training videos, our method first extracts local motion and appearance features, quantizes them to a visual vocabulary, and then forms candidate neighborhoods consisting of the words associated with nearby points and their orientation with respect to the central interest point. Rather than dictate a particular scaling of the spatial and temporal dimensions to determine which points are near, we show how to learn the class-specific distance functions that form the most informative configurations. Descriptors for these variable-sized neighborhoods are then recursively mapped to higher-level vocabularies, producing a hierarchy of space-time configurations at successively broader scales. Our approach yields state-of-the-art performance on the UCF Sports and KTH datasets.
Keywords :
feature extraction; humanities; image motion analysis; pose estimation; spatiotemporal phenomena; video signal processing; vocabulary; KTH dataset; UCF sport; bag-of-words model; class-specific distance function; discriminative space-time neighborhood feature; human action recognition; local descriptor; spatio-temporal feature; training video set; variable-sized neighborhood; visual vocabulary; Anthropometry; Computer science; Data mining; Histograms; Humans; Motion measurement; Shape measurement; Surveillance; Tracking; Vocabulary
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on
Conference_Location :
San Francisco, CA
ISSN :
1063-6919
Print_ISBN :
978-1-4244-6984-0
Type :
conf
DOI :
10.1109/CVPR.2010.5539881
Filename :
5539881