Title :
Monte Carlo Tree Search for Scheduling Activity Recognition
Author :
Amer, Moh R. ; Todorovic, Sinisa ; Fern, Alan ; Zhu, Song-Chun
Author_Institution :
Oregon State Univ., Corvallis, OR, USA
Abstract :
This paper addresses the recognition of human activities with stochastic structure, characterized by variable space-time arrangements of primitive actions and conducted by a variable number of actors. Our approach classifies the activity of interest and also identifies the relevant foreground in the video. Each activity is represented by a Sum-Product Network (SPN) that captures a linear mixture of many bags-of-words (BoWs), where each BoW represents an important foreground part of the activity. The mixture distribution is computed efficiently by organizing the BoWs in a hierarchy, where children BoWs are nested within parent BoWs. The SPN can model this mixture because it consists of terminal nodes representing BoWs, together with product nodes and sum nodes, organized in a number of layers. The products encode particular configurations of primitive actions, and the sums capture their alternative configurations. SPN inference amounts to parsing the SPN graph, which yields the most probable explanation (MPE) of the video foreground. Under fairly general conditions, SPN inference has linear complexity in the number of nodes, enabling fast and scalable recognition. The connectivity of the SPN and the parameters of the BoW distributions are learned under weak supervision using a variational EM algorithm. For evaluation, we have compiled and annotated a new Volleyball dataset. Our classification accuracy and localization results are superior to those of the state of the art on current benchmarks as well as on our Volleyball dataset.
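The MPE parsing described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class names (`Leaf`, `Product`, `Sum`), the function `mpe`, the leaf likelihoods, and the sum weights are all hypothetical, and the network is assumed to be a small tree rather than a general layered graph. Sum nodes are evaluated by taking the best weighted child (max-product), and each node is visited once, so the cost is linear in the number of nodes for a tree.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Leaf:
    """Terminal node; stands in for one BoW (likelihood value is made up)."""
    name: str
    likelihood: float


@dataclass
class Product:
    """Product node: one particular configuration of its children."""
    children: List["Node"]


@dataclass
class Sum:
    """Sum node: weighted alternatives among its children."""
    children: List["Node"]
    weights: List[float]


Node = Union[Leaf, Product, Sum]


def mpe(node: Node) -> Tuple[float, List[str]]:
    """Return (score, selected leaf names) with each sum replaced by a max."""
    if isinstance(node, Leaf):
        return node.likelihood, [node.name]
    if isinstance(node, Product):
        score, leaves = 1.0, []
        for child in node.children:
            child_score, child_leaves = mpe(child)
            score *= child_score
            leaves.extend(child_leaves)
        return score, leaves
    # Sum node: keep only the best weighted alternative.
    best_score, best_leaves = float("-inf"), []
    for weight, child in zip(node.weights, node.children):
        child_score, child_leaves = mpe(child)
        if weight * child_score > best_score:
            best_score, best_leaves = weight * child_score, child_leaves
    return best_score, best_leaves


# Toy network: two alternative configurations of two BoWs each.
root = Sum(
    children=[
        Product([Leaf("bow_a", 0.9), Leaf("bow_b", 0.2)]),
        Product([Leaf("bow_c", 0.5), Leaf("bow_d", 0.4)]),
    ],
    weights=[0.5, 0.5],
)
score, leaves = mpe(root)
print(score, leaves)
```

In this toy network the second configuration wins, so the selected leaves play the role of the foreground BoWs. A network that shares children between parents would need memoization of node results to keep the single-visit property.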
Keywords :
Monte Carlo methods; object recognition; tree searching; BoW distributions; MPE; Monte Carlo tree search; SPN; SPN graph; Volleyball dataset; bags-of-words; human activity recognition; most probable explanation; primitive actions; scheduling activity recognition; sum-product network; variable space-time arrangements; variational EM algorithm; video foreground; Context; Detectors; Grammar; Planning; Switches; Training; Activity Recognition; And-Or Graphs; Event Analysis; Stochastic Grammars; Video Parsing;
Conference_Title :
Computer Vision (ICCV), 2013 IEEE International Conference on
Conference_Location :
Sydney, NSW
DOI :
10.1109/ICCV.2013.171