DocumentCode
2401748
Title
Action snippets: How many frames does human action recognition require?
Author
Schindler, Konrad ; Van Gool, Luc
Author_Institution
BIWI, ETH Zurich, Zurich
fYear
2008
fDate
23-28 June 2008
Firstpage
1
Lastpage
8
Abstract
Visual recognition of human actions in video clips has been an active field of research in recent years. However, most published methods either analyse an entire video and assign it a single action label, or use relatively large look-ahead to classify each frame. Contrary to these strategies, human vision proves that simple actions can be recognised almost instantaneously. In this paper, we present a system for action recognition from very short sequences ("snippets") of 1-10 frames, and systematically evaluate it on standard data sets. It turns out that even local shape and optic flow for a single frame are enough to achieve ≈90% correct recognitions, and snippets of 5-7 frames (0.3-0.5 seconds of video) are enough to achieve a performance similar to the one obtainable with the entire video sequence.
Keywords
image classification; image motion analysis; image recognition; image sequences; video signal processing; action recognition; action snippets; human action recognition; human vision; video clips; visual recognition; Feature extraction; Humans; Image motion analysis; Layout; Legged locomotion; Shape; Surveillance; Video sequences; Visual databases; Voting;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on
Conference_Location
Anchorage, AK
ISSN
1063-6919
Print_ISBN
978-1-4244-2242-5
Electronic_ISBN
1063-6919
Type
conf
DOI
10.1109/CVPR.2008.4587730
Filename
4587730