DocumentCode
1764141
Title
Interactive Phrases: Semantic Descriptions for Human Interaction Recognition
Author
Yu Kong ; Yunde Jia ; Yun Fu
Author_Institution
Dept. of Electr. & Comput. Eng., Northeastern Univ., Boston, MA, USA
Volume
36
Issue
9
fYear
2014
fDate
Sept. 2014
Firstpage
1775
Lastpage
1788
Abstract
This paper addresses the problem of recognizing human interactions from videos. We propose a novel approach that recognizes human interactions through learned high-level descriptions called interactive phrases. Interactive phrases describe motion relationships between interacting people. These phrases naturally exploit human knowledge and allow us to construct a more descriptive model for recognizing human interactions. We propose a discriminative model to encode interactive phrases based on the latent SVM formulation. Interactive phrases are treated as latent variables and are used as mid-level features. To complement manually specified interactive phrases, we also discover data-driven phrases that are potentially useful and discriminative for differentiating human interactions. An information-theoretic approach is employed to learn the data-driven phrases. The interdependencies between interactive phrases are explicitly captured in the model to deal with motion ambiguity and partial occlusion in the interactions. We evaluate our method on the BIT-Interaction, UT-Interaction, and Collective Activity data sets. Experimental results show that our approach outperforms previous approaches.
Keywords
gesture recognition; support vector machines; video signal processing; BIT-interaction data set; UT-interaction data set; collective activity data set; data-driven phrases; descriptive model; discriminative phrases; human interaction recognition; human knowledge; information-theoretic approach; interacting people; interactive phrase encoding; latent SVM formulation; latent variables; learned high-level descriptions; mid-level features; motion ambiguity; motion relationships; partial occlusion; specified interactive phrases; training video; Feature extraction; Hidden Markov models; Semantics; Torso; Training; Vectors; Videos; Human interaction; action recognition; latent structural SVM;
fLanguage
English
Journal_Title
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publisher
IEEE
ISSN
0162-8828
Type
jour
DOI
10.1109/TPAMI.2014.2303090
Filename
6739171