Regional Information Center for Science and Technology - Enhancing human action recognition through spatio-temporal feature learning and semantic rules

DocumentCode :

695154

Title :

Enhancing human action recognition through spatio-temporal feature learning and semantic rules

Author :

Ramirez-Amaro, Karinne ; Eun-Sol Kim ; Jiseob Kim ; Byoung-Tak Zhang ; Beetz, Michael ; Cheng, Gordon

Author_Institution :

Fac. of Electr. Eng., Tech. Univ. of Munich, Munich, Germany

fYear :

2013

fDate :

15-17 Oct. 2013

Firstpage :

456

Lastpage :

461

Abstract :

In this paper, we present a two-stage framework that deals with the problem of automatically extracting human activities from videos. First, for action recognition we employ a state-of-the-art unsupervised learning algorithm based on Independent Subspace Analysis (ISA). This learning algorithm extracts spatio-temporal features directly from video data, and it is computationally more efficient and more robust than other unsupervised methods. Nevertheless, when this one-stage state-of-the-art action recognition technique is applied to observations of everyday human activities, it reaches an accuracy of only approximately 25%. Hence, we propose to enhance this process with a second stage, which defines a new method to automatically generate semantic rules that can reason about human activities. The obtained semantic rules enhance human activity recognition by reducing the complexity of the perception system, and they allow the possibility of domain change, which can greatly improve the synthesis of robot behaviors. The proposed method was evaluated in two complex and challenging scenarios: making a pancake and making a sandwich. The difficulty of these scenarios is that they contain finer and more complex activities than the well-known data sets (Hollywood2, KTH, etc.). The results show the benefits of the two-stage method: the accuracy of action recognition was significantly improved compared to a single-stage method (above 87% compared to a human expert). This indicates that the reasoning engine improves the framework for the automatic extraction of human activities from observations, thus providing a rich mechanism for transferring a wide range of human skills to humanoid robots.

Keywords :

gesture recognition; humanoid robots; image motion analysis; robot vision; unsupervised learning; video signal processing; human action recognition; human activity recognition; humanoid robot; independent subspace analysis; reasoning engine; robot behavior; semantic rules; spatio-temporal feature learning; unsupervised state-of-the-art learning algorithm; video extraction; Accuracy; Algorithm design and analysis; Cameras; Feature extraction; Testing; Training; Videos;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Humanoid Robots (Humanoids), 2013 13th IEEE-RAS International Conference on

Conference_Location :

Atlanta, GA

ISSN :

2164-0572

Print_ISBN :

978-1-4799-2617-6

Type :

conf

DOI :

10.1109/HUMANOIDS.2013.7030014

Filename :

7030014

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=695154