Regional Information Center for Science and Technology - Recognizing human actions using multiple features

DocumentCode :

2398378

Title :

Recognizing human actions using multiple features

Author :

Liu, Jingen ; Ali, Saad ; Shah, Mubarak

Author_Institution :

Comput. Vision Lab., Univ. of Central Florida, Orlando, FL

fYear :

2008

fDate :

23-28 June 2008

Firstpage :

Lastpage :

Abstract :

In this paper, we propose a framework that fuses multiple features for improved action recognition in videos. Fusing multiple features is important because a representation based on a single feature is often not enough to capture imaging variations (viewpoint, illumination, etc.) and attributes of individuals (size, age, gender, etc.). Hence, we use two types of features: i) a quantized vocabulary of local spatio-temporal (ST) volumes (or cuboids), and ii) a quantized vocabulary of spin-images, which aims to capture the shape deformation of the actor by treating actions as 3D objects (x, y, t). To optimally combine these features, we treat different features as nodes in a graph, where weighted edges between nodes represent the strength of the relationship between entities. The graph is then embedded into a k-dimensional space subject to the criterion that similar nodes have Euclidean coordinates that are closer to each other. This is achieved by converting the constraint into a minimization problem whose solution is given by the eigenvectors of the graph Laplacian matrix. This procedure is known as Fiedler embedding. The performance of the proposed framework is tested on publicly available data sets. The results demonstrate that fusing multiple features improves performance and allows retrieval of meaningful features and videos from the embedding space.
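The embedding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `fiedler_embedding`, the toy weight matrix, and the choice to use unscaled eigenvectors as coordinates are all assumptions made for illustration. The sketch builds the graph Laplacian L = D - W from a symmetric weight matrix W and uses the eigenvectors of the k smallest non-trivial eigenvalues as node coordinates, assuming a connected graph.

```python
import numpy as np


def fiedler_embedding(weights, k):
    """Embed graph nodes in k dimensions using Laplacian eigenvectors.

    Hypothetical helper for illustration only. `weights` is a symmetric,
    non-negative (n x n) edge-weight matrix of a connected graph.
    """
    w = np.array(weights, dtype=float)  # copy so the caller's data is untouched
    if w.ndim != 2 or w.shape[0] != w.shape[1]:
        raise ValueError("weights must be a square matrix")
    if not np.allclose(w, w.T):
        raise ValueError("weights must be symmetric")
    n = w.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of nodes")

    np.fill_diagonal(w, 0.0)                 # ignore self-loops
    laplacian = np.diag(w.sum(axis=1)) - w   # L = D - W

    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(laplacian)

    # Skip the first (constant) eigenvector of a connected graph and keep
    # the next k as coordinates; row i is the position of node i.
    return eigvecs[:, 1:k + 1]


# Toy graph (assumed data): two tightly linked triangles joined by one weak edge.
toy_weights = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    toy_weights[i, j] = toy_weights[j, i] = 1.0
toy_weights[2, 3] = toy_weights[3, 2] = 0.1

coords = fiedler_embedding(toy_weights, k=1)
print(coords.ravel())
```

In this toy graph, nodes 0 to 2 and nodes 3 to 5 land on opposite sides of zero along the single embedding axis, which is the sense in which strongly related nodes receive nearby coordinates. In the setting the abstract describes, the nodes would be videos and the two kinds of vocabulary features rather than abstract indices.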

Keywords :

eigenvalues and eigenfunctions; feature extraction; graph theory; image recognition; image representation; matrix algebra; video retrieval; Euclidean coordinates; Fiedler embedding; graph Laplacian matrix; human action recognition; imaging variations; local spatiotemporal volumes; multiple features; quantized vocabulary; single feature based representation; video recognition; video retrieval; Fuses; Humans; Image recognition; Laplace equations; Lighting; Matrix converters; Shape; Testing; Videos; Vocabulary

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on

Conference_Location :

Anchorage, AK

ISSN :

1063-6919

Print_ISBN :

978-1-4244-2242-5

Electronic_ISBN :

1063-6919

Type :

conf

DOI :

10.1109/CVPR.2008.4587527

Filename :

4587527

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2398378