Title :
Learning the spatial semantics of manipulation actions through preposition grounding
Author :
Zampogiannis, Konstantinos; Yang, Yezhou; Fermüller, Cornelia; Aloimonos, Yiannis
Author_Institution :
Dept. of Comput. Sci., Univ. of Maryland, College Park, MD, USA
Abstract :
In this paper, we introduce an abstract representation for manipulation actions that is based on the evolution of the spatial relations between involved objects. Object tracking in RGBD streams enables straightforward and intuitive ways to model spatial relations in 3D space. Reasoning in 3D overcomes many of the limitations of similar previous approaches, while providing significant flexibility in the desired level of abstraction. At each frame of a manipulation video, we evaluate a number of spatial predicates for all object pairs and treat the resulting set of sequences (Predicate Vector Sequences, PVS) as an action descriptor. As part of our representation, we introduce a symmetric, time-normalized pairwise distance measure that relies on finding an optimal object correspondence between two actions. We experimentally evaluate the method on the classification of various manipulation actions in video, performed at different speeds and timings and involving different objects. The results demonstrate that the proposed representation is remarkably descriptive of the high-level manipulation semantics.
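The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the predicate set, the tracking output, or the exact distance. The three predicates (above, below, touching), the `touch` threshold, the use of object centroids, and the choice of dynamic time warping with a Hamming cost are all assumptions made here for illustration, and every function name is hypothetical.

```python
from itertools import permutations


def predicates(a, b, touch=0.05):
    """Hypothetical binary predicates of centroid a relative to centroid b."""
    dx, dy, dz = (a[i] - b[i] for i in range(3))
    dist = (dx * dx + dy * dy + dz * dz) ** 0.5
    return (int(dz > 0),        # a is above b (assumed predicate)
            int(dz < 0),        # a is below b (assumed predicate)
            int(dist < touch))  # a is touching/near b (assumed threshold)


def pvs(tracks):
    """Build a predicate vector sequence from per-object 3D trajectories.

    tracks[i][t] is the (x, y, z) centroid of object i at frame t.
    Returns one flattened predicate vector per frame over all ordered pairs.
    """
    n, frames = len(tracks), len(tracks[0])
    return [tuple(p
                  for i in range(n) for j in range(n) if i != j
                  for p in predicates(tracks[i][t], tracks[j][t]))
            for t in range(frames)]


def dtw(s, t):
    """Time-normalized DTW between two sequences, using a Hamming frame cost."""
    inf = float("inf")
    d = [[inf] * (len(t) + 1) for _ in range(len(s) + 1)]
    d[0][0] = 0.0
    for i in range(1, len(s) + 1):
        for j in range(1, len(t) + 1):
            cost = sum(x != y for x, y in zip(s[i - 1], t[j - 1]))
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    # Dividing by the summed lengths keeps the measure symmetric.
    return d[len(s)][len(t)] / (len(s) + len(t))


def action_distance(tracks_a, tracks_b):
    """Minimum DTW distance over all object correspondences (brute force)."""
    seq_a = pvs(tracks_a)
    return min(dtw(seq_a, pvs(list(perm)))
               for perm in permutations(tracks_b))


# Usage: the same "put down" action at two speeds, with objects listed
# in a different order, compared against a "lift" action.
still = (0.0, 0.0, 0.0)
slow = [[(0, 0, 0.3), (0, 0, 0.2), (0, 0, 0.1), (0, 0, 0.02)], [still] * 4]
fast = [[still] * 2, [(0, 0, 0.3), (0, 0, 0.02)]]
lift = [[(0, 0, 0.02), (0, 0, 0.3)], [still] * 2]

print(action_distance(slow, fast))
print(action_distance(slow, lift))
```

Brute-force enumeration of correspondences is only practical for the handful of objects typical of a manipulation scene; an assignment solver would be a natural replacement for larger scenes.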
Keywords :
image classification; learning (artificial intelligence); object tracking; robot vision; classification; manipulation actions abstract representation; preposition grounding; robot; spatial semantics learning; symmetric time-normalized pairwise distance measure; Cognition; Grounding; Radio frequency; Robot sensing systems; Semantics; Three-dimensional displays
Conference_Title :
Robotics and Automation (ICRA), 2015 IEEE International Conference on
Conference_Location :
Seattle, WA
DOI :
10.1109/ICRA.2015.7139371