DocumentCode :
3609977
Title :
Types, Locations, and Scales from Cluttered Natural Video and Actions
Author :
Xiaoying Song ; Wenqiang Zhang ; Juyang Weng
Author_Institution :
Sch. of Comput. Sci., Fudan Univ., Shanghai, China
Volume :
7
Issue :
4
fYear :
2015
Firstpage :
273
Lastpage :
286
Abstract :
We model the autonomous development of brain-inspired circuits through two modalities, a video stream and an action stream, that are synchronized in time. We assume that such multimodal streams are available to a baby through inborn reflexes, self-supervision, and caretaker's supervision when the baby interacts with the real world. By autonomous development, we mean not only that the internal (inside the “skull”) self-organization is fully autonomous, but also that the developmental program (DP) that regulates the computation of the network is task nonspecific. In this work, the task-nonspecificity is reflected by the fact that the actions associated with an attended object in a cluttered, natural, and dynamic scene are taught after the DP is finished and the “life” has begun. The actions correspond to neuronal firing patterns representing object type, object location, and object scale, but learning is directly from unsegmented cluttered scenes. Along the line of where-what networks (WWN), this is the first one that explicitly models multiple “brain” areas, each for a different range of object scales. The experiments included large natural video experiments. To show the power of automatic attention in unknown cluttered backgrounds, the last experimental group demonstrated disjoint tests in the presence of large within-class variations (object 3-D rotations in very different unknown backgrounds), but small between-class variations (small object patches in large similar and different unknown backgrounds), in contrast with global classification tests such as ImageNet and Atari Games.
Keywords :
object detection; object recognition; video signal processing; 3D rotations; Atari games; DP; ImageNet; WWN; action stream; automatic attention; autonomous development; brain areas; brain-inspired circuits; caretaker supervision; cluttered backgrounds; cluttered natural video; developmental program; dynamic scene; global classification tests; modalities; multimodal streams; natural video experiments; neuronal firing patterns; object location; object scale; self-supervision; task-nonspecificity; unsegmented cluttered scenes; video stream; where-what networks; within-class variations; Brain modeling; Computational modeling; Computer architecture; Neurons; Object recognition; Robot sensing systems; Attention; brain informed; cluttered scene; feature development; invariance; multimodal; neural networks; object detection; object recognition;
fLanguage :
English
Journal_Title :
Autonomous Mental Development, IEEE Transactions on
Publisher :
IEEE
ISSN :
1943-0604
Type :
jour
DOI :
10.1109/TAMD.2015.2478377
Filename :
7322214