DocumentCode :
2289957
Title :
Learning deformable action templates from cluttered videos
Author :
Yao, Benjamin ; Zhu, Song-Chun
Author_Institution :
Dept. of Stat., UCLA, Los Angeles, CA, USA
fYear :
2009
fDate :
Sept. 29 2009-Oct. 2 2009
Firstpage :
1507
Lastpage :
1514
Abstract :
In this paper, we present a Deformable Action Template (DAT) model that is learnable from cluttered real-world videos with weak supervision. In our generative model, an action template is a sequence of image templates, each of which consists of a set of shape and motion primitives (Gabor wavelets and optical-flow patches) at selected orientations and locations. These primitives are allowed to slightly perturb their locations and orientations to account for spatial deformations. We use a shared pursuit algorithm to automatically discover the best set of primitives and weights by maximizing the likelihood over one or more aligned training examples. Since it is extremely hard to accurately label human actions in real-world videos, we use a three-step semi-supervised learning procedure. 1) For each human action class, a template is initialized from a labeled (one bounding box per frame) training video. 2) The template is used to detect actions in other training videos of the same class by a dynamic space-time warping algorithm, which uses dynamic programming to search for the best match between the template and the target video in a 5D space (x, y, scale, t_template, and t_target). 3) The template is updated by the shared pursuit algorithm over all aligned videos. The second and third steps are iterated several times to arrive at an optimal action template. We tested our algorithm on a cluttered action dataset (the CMU dataset) and achieved favorable performance compared with previously reported results. Our classification performance on the KTH dataset is also comparable to the state of the art.
Keywords :
dynamic programming; image classification; image segmentation; image sequences; learning (artificial intelligence); video signal processing; aligned videos; cluttered action dataset; cluttered videos; deformable action learning; deformable action template model; dynamic space-time warping algorithm; likelihood maximization; semisupervised learning; shared pursuit algorithm; videos
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Vision, 2009 IEEE 12th International Conference on
Conference_Location :
Kyoto
ISSN :
1550-5499
Print_ISBN :
978-1-4244-4420-5
Electronic_ISBN :
1550-5499
Type :
conf
DOI :
10.1109/ICCV.2009.5459277
Filename :
5459277