Title :
Joint action recognition and pose estimation from video
Author :
Bruce Xiaohan Nie;Caiming Xiong;Song-Chun Zhu
Author_Institution :
Center for Vision, Cognition, Learning and Art, University of California, Los Angeles, USA
fDate :
6/1/2015 12:00:00 AM
Abstract :
Action recognition and pose estimation from video are closely related tasks for understanding human motion; most methods, however, learn separate models and combine them sequentially. In this paper, we propose a framework to integrate training and testing of the two tasks. A spatial-temporal And-Or graph model is introduced to represent action at three scales. Specifically, the action is decomposed into poses, which are further divided into mid-level ST-parts and then parts. The hierarchical structure of our model captures the geometric and appearance variations of pose at each frame, and lateral connections between ST-parts at adjacent frames capture the action-specific motion information. The model parameters for the three scales are learned discriminatively, and action labels and poses are efficiently inferred by dynamic programming. Experiments demonstrate that our approach achieves state-of-the-art accuracy in action recognition while also improving pose estimation.
Keywords :
"Joints","Feature extraction","Training","Hidden Markov models","Graphical models","Trajectory"
Conference_Titel :
Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on
Electronic_ISBN :
1063-6919
DOI :
10.1109/CVPR.2015.7298734