Regional Information Center for Science and Technology - A selective spatio-temporal interest point detector for human action recognition in complex scenes

DocumentCode :

2957800

Title :

A selective spatio-temporal interest point detector for human action recognition in complex scenes

Author :

Chakraborty, Bhaskar ; Holte, Michael B. ; Moeslund, Thomas B. ; Gonzalez, Jordi ; Roca, F. Xavier

Author_Institution :

Comput. Vision Center, Univ. Autonoma de Barcelona, Barcelona, Spain

fYear :

2011

fDate :

6-13 Nov. 2011

Firstpage :

1776

Lastpage :

1783

Abstract :

Recent progress in the field of human action recognition points towards the use of Spatio-Temporal Interest Points (STIPs) for local descriptor-based recognition strategies. In this paper we present a new approach for STIP detection by applying surround suppression combined with local and temporal constraints. Our method is significantly different from existing STIP detectors and improves the performance by detecting more repeatable, stable and distinctive STIPs for human actors, while suppressing unwanted background STIPs. For action representation we use a bag-of-visual words (BoV) model of local N-jet features to build a vocabulary of visual-words. To this end, we introduce a novel vocabulary building strategy by combining spatial pyramid and vocabulary compression techniques, resulting in improved performance and efficiency. Action class specific Support Vector Machine (SVM) classifiers are trained for categorization of human actions. A comprehensive set of experiments on existing benchmark datasets, and more challenging datasets of complex scenes, validate our approach and show state-of-the-art performance.
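The bag-of-visual-words representation named in the abstract can be illustrated with a minimal sketch. It covers only the generic step of assigning local descriptors to their nearest visual word and counting them into a normalised histogram. It does not reproduce the authors' STIP detector, N-jet features, spatial pyramid, vocabulary compression, or SVM training. The function names `nearest_word` and `bov_histogram` and all toy values are hypothetical and chosen for illustration only.

```python
from math import dist


def nearest_word(descriptor, vocabulary):
    """Return the index of the visual word closest to the descriptor (Euclidean)."""
    return min(range(len(vocabulary)), key=lambda i: dist(descriptor, vocabulary[i]))


def bov_histogram(descriptors, vocabulary):
    """Build an L1-normalised bag-of-visual-words histogram for one video."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        counts[nearest_word(d, vocabulary)] += 1
    total = sum(counts)
    # An empty descriptor set yields an all-zero histogram instead of dividing by zero.
    return [c / total for c in counts] if total else [0.0] * len(vocabulary)


# Toy data (hypothetical): a 3-word vocabulary and four 2-D descriptors.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.7, 5.2)]
histogram = bov_histogram(descriptors, vocabulary)
print(histogram)  # [0.25, 0.5, 0.25]
```

In a full pipeline, histograms of this kind would typically serve as the feature vectors passed to per-class classifiers; the vocabulary itself is usually obtained by clustering training descriptors, which this sketch assumes has already been done.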

Keywords :

image classification; object recognition; support vector machines; vocabulary; STIP detection; bag-of-visual words model; complex scenes; human action recognition; local N-jet features; local constraints; local descriptor-based recognition strategies; selective spatio-temporal interest point detector; spatial pyramid technique; support vector machine classifier; surround suppression; temporal constraints; visual-words vocabulary; vocabulary building strategy; vocabulary compression techniques; Buildings; Detectors; Feature extraction; Humans; Support vector machines; Vocabulary; YouTube;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Computer Vision (ICCV), 2011 IEEE International Conference on

Conference_Location :

Barcelona

ISSN :

1550-5499

Print_ISBN :

978-1-4577-1101-5

Type :

conf

DOI :

10.1109/ICCV.2011.6126443

Filename :

6126443

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2957800