Multiple/Single-View Human Action Recognition via Part-Induced Multitask Structural Learning

Author

An-An Liu ; Yu-Ting Su ; Ping-Ping Jia ; Zan Gao ; Tong Hao ; Zhao-Xuan Yang

Author_Institution

Sch. of Electron. Inf. Eng., Tianjin Univ., Tianjin, China

Volume

45

Issue

6

fYear

2015

fDate

Jun-15

Firstpage

1194

Lastpage

1208

Abstract

This paper proposes a unified framework for multiple/single-view human action recognition. First, we propose a hierarchical partwise bag-of-words representation, which encodes both local and global visual saliency based on the body structure cue. Then, we formulate multiple/single-view human action recognition as a part-regularized multitask structural learning (MTSL) problem, which has two advantages for both model learning and feature selection: 1) preserving the consistency between the body-based action classification and the part-based action classification with the complementary information among different action categories and multiple views and 2) discovering both action-specific and action-shared feature subspaces to strengthen the generalization ability of model learning. Moreover, we contribute two novel human action recognition datasets, TJU (a single-view multimodal dataset) and MV-TJU (a multiview multimodal dataset). The proposed method is validated on three kinds of challenging datasets, including two single-view RGB datasets (KTH and TJU), two well-known depth datasets (MSR action 3-D and MSR daily activity 3-D), and one novel multiview multimodal dataset (MV-TJU). The extensive experimental results show that this method can outperform the popular 2-D/3-D part model-based methods and several other competing methods for multiple/single-view human action recognition in both RGB and depth modalities. To our knowledge, this paper is the first to demonstrate the applicability of MTSL with part-based regularization to multiple/single-view human action recognition in both RGB and depth modalities.

Keywords

feature selection; image classification; image colour analysis; image motion analysis; learning (artificial intelligence); MSR action 3D; MSR daily activity 3D; MTSL problem; MV-TJU; action-shared feature; action-specific feature; body-based action classification; hierarchical partwise bag-of-words representation; model learning; multiple-view human action recognition; multiview multimodal dataset; part-based action classification; part-based regularization; part-induced multitask structural learning; part-regularized multitask structural learning; single-view RGB datasets; single-view human action recognition; single-view multimodal dataset; Correlation; Feature extraction; Hidden Markov models; Joints; Spatiotemporal phenomena; Visualization; Multitask learning (MTL); multiview human action recognition; partwise bag-of-words; structural learning

fLanguage

English

Journal_Title

Cybernetics, IEEE Transactions on

Publisher

IEEE

ISSN

2168-2267

Type

jour

DOI

10.1109/TCYB.2014.2347057

Filename

6884839