Regional Information Center for Science and Technology - A temporal Bayesian model for classifying, detecting and localizing activities in video sequences

DocumentCode :

2603011

Title :

A temporal Bayesian model for classifying, detecting and localizing activities in video sequences

Author :

Malgireddy, Manavender R. ; Inwogu, Ifeoma ; Govindaraju, Venu

Author_Institution :

Univ. at Buffalo, Buffalo, NY, USA

fYear :

2012

fDate :

16-21 June 2012

Firstpage :

Lastpage :

Abstract :

We present a framework to detect and localize activities in unconstrained real-life video sequences. This is a more challenging problem than activity classification, as it subsumes the classification problem and also requires us to work with unconstrained videos. To obtain real-life data, we have focused on using the Human Motion Database (HMDB), a collection of realistic video clips. The detection and localization paradigm we introduce uses a keyword model for detecting key activities or gestures in a video sequence. This process is analogous to the use of keyword or key-phrase detection in speech processing. The method learns models for the activities-of-interest during training, so that when presented with a network of activities (a representation of video sequences) at testing, the goal is to detect the keywords in the network. Our approach for classification outperformed all the current state-of-the-art classifiers when tested on two publicly available datasets, KTH and HMDB. We also tested this paradigm for spotting gestures via a one-shot-learning approach on the CHALEARN gesture dataset and obtained very promising results. Our approach was ranked among the top five performing techniques in the CHALEARN 2012 gesture spotting competition.

Keywords :

Bayes methods; image classification; image sequences; learning (artificial intelligence); object detection; video signal processing; CHALEARN 2012 gesture spotting competition; CHALEARN gesture dataset; HMDB dataset; KTH dataset; activities-of-interest; activity classification; activity detection; activity localization; gesture detection; human motion database; key-phrase detection; keyword detection; keyword model; model learning; one-shot-learning approach; real-life video sequences; speech processing; temporal Bayesian model; video clips; Accuracy; Computational modeling; Feature extraction; Hidden Markov models; Probabilistic logic; Video sequences; Visualization;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Computer Vision and Pattern Recognition Workshops (CVPRW), 2012 IEEE Computer Society Conference on

Conference_Location :

Providence, RI

ISSN :

2160-7508

Print_ISBN :

978-1-4673-1611-8

Electronic_ISBN :

2160-7508

Type :

conf

DOI :

10.1109/CVPRW.2012.6239185

Filename :

6239185

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2603011