Regional Information Center for Science and Technology (RICeST) - Types, Locations, and Scales from Cluttered Natural Video and Actions

DocumentCode :

3609977

Title :

Types, Locations, and Scales from Cluttered Natural Video and Actions

Author :

Xiaoying Song ; Wenqiang Zhang ; Juyang Weng

Author_Institution :

Sch. of Comput. Sci., Fudan Univ., Shanghai, China

Volume :

Issue :

fYear :

2015

Firstpage :

273

Lastpage :

286

Abstract :

We model the autonomous development of brain-inspired circuits through two modalities, a video stream and an action stream, that are synchronized in time. We assume that such multimodal streams are available to a baby through inborn reflexes, self-supervision, and caretaker's supervision when the baby interacts with the real world. By autonomous development, we mean not only that the internal (inside the “skull”) self-organization is fully autonomous, but also that the developmental program (DP) that regulates the computation of the network is task nonspecific. In this work, the task-nonspecificity is reflected by the fact that the actions associated with an attended object in a cluttered, natural, and dynamic scene are taught after the DP is finished and the “life” has begun. The actions correspond to neuronal firing patterns representing object type, object location, and object scale, but learning is directly from unsegmented cluttered scenes. Along the line of where-what networks (WWN), this is the first one that explicitly models multiple “brain” areas, each for a different range of object scales. The experiments included large natural video experiments. To show the power of automatic attention in unknown cluttered backgrounds, the last experimental group demonstrated disjoint tests in the presence of large within-class variations (object 3-D rotations in very different unknown backgrounds), but small between-class variations (small object patches in large similar and different unknown backgrounds), in contrast with global classification tests such as ImageNet and Atari Games.

Keywords :

object detection; object recognition; video signal processing; 3D rotations; Atari games; DP; ImageNet; WWN; action stream; automatic attention; autonomous development; brain areas; brain-inspired circuits; caretaker supervision; cluttered backgrounds; cluttered natural video; developmental program; dynamic scene; global classification tests; modalities; multimodal streams; natural video experiments; neuronal firing patterns; object location; object scale; self-supervision; task-nonspecificity; unsegmented cluttered scenes; video stream; where-what networks; within-class variations; Brain modeling; Computational modeling; Computer architecture; Neurons; Object recognition; Robot sensing systems; Attention; brain informed; cluttered scene; feature development; invariance; multimodal; neural networks; object detection; object recognition;

fLanguage :

English

Journal_Title :

Autonomous Mental Development, IEEE Transactions on

Publisher :

IEEE

ISSN :

1943-0604

Type :

jour

DOI :

10.1109/TAMD.2015.2478377

Filename :

7322214

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3609977