DocumentCode
6221
Title
Learning Discriminative Key Poses for Action Recognition
Author
Li Liu ; Ling Shao ; Xiantong Zhen ; Xuelong Li
Author_Institution
Dept. of Electron. & Electr. Eng., Univ. of Sheffield, Sheffield, UK
Volume
43
Issue
6
fYear
2013
fDate
Dec. 2013
Firstpage
1860
Lastpage
1870
Abstract
In this paper, we present a new approach for human action recognition based on key-pose selection and representation. Poses in video frames are described by the proposed extensive pyramidal features (EPFs), which include the Gabor, Gaussian, and wavelet pyramids. These features encode orientation, intensity, and contour information and therefore provide an informative representation of human poses. Because not all poses in a sequence are discriminative and representative, we further use the AdaBoost algorithm to learn a subset of discriminative poses. Given the boosted poses for each video sequence, a new classifier, named weighted local naive Bayes nearest neighbor (WLNBNN), is proposed for the final action classification; it is demonstrated to be more accurate and robust than other classifiers, e.g., the support vector machine (SVM) and naive Bayes nearest neighbor. The proposed method is systematically evaluated on the KTH data set, the Weizmann data set, the multiview IXMAS data set, and the challenging HMDB51 data set. Experimental results show that our method outperforms state-of-the-art techniques in terms of recognition rate.
Keywords
feature extraction; image classification; image representation; image sequences; learning (artificial intelligence); pose estimation; video signal processing; AdaBoost algorithm; EPF; Gabor pyramid; Gaussian pyramid; HMDB51 data set; KTH data set; SVM; Weizmann data set; action classification; contour information; discriminative key pose learning; extensive pyramidal features; human action recognition; human pose representation; intensity information; key-pose representation; key-pose selection; multiview IXMAS data set; orientation information; recognition rate; support vector machine; video frames; video sequence; wavelet pyramid; weighted local naive Bayes nearest neighbor classifier; Feature extraction; Humans; Laplace equations; Robustness; Spatiotemporal phenomena; Support vector machines; Video sequences; AdaBoost; computer vision; extensive pyramidal features (EPFs); human action recognition; pose selection; weighted local naive Bayes nearest neighbor (WLNBNN) classifier;
fLanguage
English
Journal_Title
Cybernetics, IEEE Transactions on
Publisher
IEEE
ISSN
2168-2267
Type
jour
DOI
10.1109/TSMCB.2012.2231959
Filename
6409441