Regional Information Center for Science and Technology - Semantic saliency using k-TR theory of visual perception

DocumentCode :

595519

Title :

Semantic saliency using k-TR theory of visual perception

Author :

Varadarajan, Karthik Mahesh ; Vincze, Markus

Author_Institution :

Tech. Univ. of Vienna, Vienna, Austria

fYear :

2012

fDate :

11-15 Nov. 2012

Firstpage :

3676

Lastpage :

3679

Abstract :

Saliency in 2D imagery has received increasing attention over the last few years, owing to the need to minimize computational requirements through visual search space reduction, especially in the field of domestic robotics. Saliency and pre-attention mechanisms such as the Itti-Koch model have largely focused on multi-scale local features that mimic low-level attention processes in the visual system, without regard for the semantic content of the scene and therefore without cognitive grounding in visual processing. The 'k-TR' theory presents the first attempt at a true cognitive understanding of scenes by explaining visual perception and object recognition in terms of Recognition of Component Affordances (RBCA). The k-TR model presents a bi-layer recognition process that combines local, global, semantic and affordance features. The k-TR theory cites psychophysical, neurobiological, linguistic and evolutionary studies in its support and explains recognition of over 250 categories of common household objects. The features used by k-TR for object representation, termed k-TRONs, are available from the public Affordance Network database (AfNet). In this paper, we use the k-TRON features, in particular the 35+ affordance features, to incorporate semantic context into saliency models. Saliency, or surprise, for pre-attention is modeled in the form of affordance aberrations. By using affordance aberration features for conspicuity map generation, we show that the resulting saliency and attention points more closely resemble the salient or surprise regions generated by the human visual system, and hence provide superior performance in comparison to the Itti framework. Furthermore, by learning affordance affinities from test subjects, the degree of influence of each affordance aberration on visual saliency is estimated and incorporated into the overall saliency model.
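
The abstract does not state how the per-affordance conspicuity maps are combined, so the following is only a minimal sketch under stated assumptions: each affordance aberration is assumed to yield one 2D conspicuity map, each map is rescaled to [0, 1], and the overall saliency is taken as a weighted sum whose weights stand in for the learned affordance affinities. The function names, the affordance names ("containability", "graspability") and the weight values are hypothetical placeholders, not taken from the paper or from AfNet.

```python
from typing import Dict, List, Tuple

Map = List[List[float]]  # a 2D grid of per-pixel values


def normalize(m: Map) -> Map:
    """Rescale a map to [0, 1]; a flat map becomes all zeros."""
    lo = min(min(row) for row in m)
    hi = max(max(row) for row in m)
    if hi == lo:
        return [[0.0 for _ in row] for row in m]
    return [[(v - lo) / (hi - lo) for v in row] for row in m]


def combine_saliency(conspicuity: Dict[str, Map],
                     weights: Dict[str, float]) -> Map:
    """Weighted sum of normalized conspicuity maps.

    ASSUMPTION: a linear combination is used here for illustration;
    the abstract only says that each aberration's degree of influence
    is incorporated into the overall saliency model.
    """
    names = list(conspicuity)
    if not names:
        raise ValueError("at least one conspicuity map is required")
    total = sum(weights[n] for n in names)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    rows = len(conspicuity[names[0]])
    cols = len(conspicuity[names[0]][0])
    out = [[0.0] * cols for _ in range(rows)]
    for n in names:
        nm = normalize(conspicuity[n])
        w = weights[n] / total  # weights are rescaled to sum to 1
        for r in range(rows):
            for c in range(cols):
                out[r][c] += w * nm[r][c]
    return out


def most_salient(saliency: Map) -> Tuple[int, int]:
    """Return the (row, col) of the highest saliency value."""
    cells = [(r, c) for r in range(len(saliency))
             for c in range(len(saliency[r]))]
    return max(cells, key=lambda rc: saliency[rc[0]][rc[1]])


# Hypothetical 2x2 conspicuity maps and affinity weights.
maps = {
    "containability": [[0.0, 0.0], [0.0, 4.0]],
    "graspability": [[2.0, 0.0], [0.0, 0.0]],
}
affinities = {"containability": 3.0, "graspability": 1.0}

saliency = combine_saliency(maps, affinities)
focus = most_salient(saliency)
```

In this toy input the more heavily weighted map dominates, so the attention point falls on the cell where that map peaks. A winner-take-all step of this kind is one common way to pick an attention point from a saliency map; whether the paper uses it is not stated in the abstract.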

Keywords :

feature extraction; image representation; object recognition; visual perception; 2D imagery; AfNet; Itti-Koch model; RBCA; affordance aberrations; affordance features; affordance network database; bilayer recognition process; cognitive scene understanding; computation requirement minimization; conspicuity map generation; domestic robotics; evolutionary studies; global features; household object recognition; human visual system; k-TR Theory; k-TRON features; linguistic studies; neurobiological studies; object representation; preattention mechanisms; psychophysical studies; recognition of component affordances; salient regions; semantic features; semantic saliency; surprise regions; visual perception; visual processing; visual saliency; visual search space reduction; Detectors; Filtration; Humans; Search problems; Semantics; Visual perception; Visualization;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Pattern Recognition (ICPR), 2012 21st International Conference on

Conference_Location :

Tsukuba

ISSN :

1051-4651

Print_ISBN :

978-1-4673-2216-4

Type :

conf

Filename :

6460962

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=595519