Regional Information Center for Science and Technology - Recognizing realistic actions from videos “in the wild”

DocumentCode :

3006228

Title :

Recognizing realistic actions from videos “in the wild”

Author :

Jingen Liu ; Jiebo Luo ; Mubarak Shah

Author_Institution :

Comput. Vision Lab., Univ. of Central Florida, Orlando, FL, USA

fYear :

2009

fDate :

20-25 June 2009

Firstpage :

1996

Lastpage :

2003

Abstract :

In this paper, we present a systematic framework for recognizing realistic actions from videos “in the wild”. Such unconstrained videos are abundant in personal collections as well as on the Web. Recognizing actions in such videos has not been addressed extensively, primarily because of the large variations caused by camera motion, background clutter, and changes in object appearance and scale. The main challenge is how to extract reliable and informative features from unconstrained videos. We extract both motion and static features from the videos. Since the raw features of both types are dense yet noisy, we propose strategies to prune them. We use motion statistics to acquire stable motion features and clean static features. Furthermore, PageRank is used to mine the most informative static features. To construct compact yet discriminative visual vocabularies, a divisive information-theoretic algorithm is employed to group semantically related features. Finally, AdaBoost is chosen to integrate all the heterogeneous yet complementary features for recognition. We have tested the framework on the KTH dataset and on our own dataset, which consists of 11 categories of actions collected from YouTube and personal videos, and have obtained impressive results for action recognition and action localization.
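The abstract says PageRank is used to mine the most informative static features, but this record does not describe how the feature graph is built. As a rough illustration only, the sketch below runs a standard power-iteration PageRank on a small weighted graph. The `pagerank` helper, the damping value of 0.85, and the toy `similarity` matrix (standing in for match strengths between four static features) are all assumptions made for this example, not details taken from the paper.

```python
def pagerank(weights, damping=0.85, iters=100, tol=1e-10):
    """Power-iteration PageRank on a weighted graph.

    weights[i][j] is the non-negative link strength from node i to node j.
    Returns a list of scores that sums to 1.
    """
    n = len(weights)
    out_sum = [sum(row) for row in weights]
    rank = [1.0 / n] * n
    for _ in range(iters):
        # Rank held by nodes with no outgoing weight is spread uniformly.
        dangling = sum(rank[i] for i in range(n) if out_sum[i] == 0)
        new_rank = []
        for j in range(n):
            incoming = sum(
                rank[i] * weights[i][j] / out_sum[i]
                for i in range(n)
                if out_sum[i] > 0
            )
            new_rank.append((1 - damping) / n + damping * (incoming + dangling / n))
        delta = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical toy graph: entry [i][j] is an assumed match strength
# between static features i and j (symmetric, zero on the diagonal).
similarity = [
    [0, 3, 2, 0],
    [3, 0, 4, 0],
    [2, 4, 0, 1],
    [0, 0, 1, 0],
]

scores = pagerank(similarity)
# Feature indices ordered from highest to lowest score.
ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
```

In a pruning step of this kind, one would keep the top-ranked features and discard the rest; how many to keep is a tuning choice that this record does not specify.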

Keywords :

image motion analysis; video signal processing; AdaBoost; KTH dataset; PageRank; YouTube; action localization; feature extraction; information-theoretic algorithm; informative static features; motion features; motion statistics; personal videos; realistic action recognition; unconstrained videos; visual vocabularies; Cameras; Computer vision; Feature extraction; Humans; Motion pictures; Shape; Spatiotemporal phenomena; Videos; Vocabulary; YouTube;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on

Conference_Location :

Miami, FL

ISSN :

1063-6919

Print_ISBN :

978-1-4244-3992-8

Type :

conf

DOI :

10.1109/CVPR.2009.5206744

Filename :

5206744

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3006228