Title :
Where-what network-4: The effect of multiple internal areas
Author :
Luciw, Matthew ; Weng, Juyang
Author_Institution :
Dept. of Comput. Sci. & Eng., Michigan State Univ., East Lansing, MI, USA
Abstract :
The general visual attention-recognition (AR) problem remains open. Given a set of images, each with a single target foreground over some complex background, the problem requires outputting both the location and the type of this single foreground. First, many approaches cannot deal with the richness of the class of possible backgrounds, which has a huge number of variations and could also include distractor-like patterns. This potentially leads to an explosion of resources required within the model. Second, all current approaches break down as the number of locations, types, and variations (within each type) increases toward human level. This paper is concerned with model selection for networks dealing with the general AR problem. The major challenge is ensuring the model remains as simple as possible as the complexity of the data increases. In developmental general AR, the model must be adapted on the fly. We discuss these issues in the context of the latest version of the biologically-inspired developmental Where-What Network. We show how local detectors reduce the number of neurons exponentially and deal with the complex background problem. The purpose of multiple layers seems to be to allow combinatorial patterns to emerge. Top-down connections cause more discriminative features to develop, but since complex data requires a bank of shared features, top-down connections are probably not beneficial for the early layer(s). When a layer's features are class-specific and there is no combinatorial structure to exploit on top of this layer, it is not useful to add another layer, but it is useful to utilize top-down connections to develop more discriminative features.
Keywords :
combinatorial mathematics; feature extraction; image recognition; set theory; biologically-inspired developmental where-what network; combinatorial pattern; complex background; data complexity; discriminative feature; distractor-like pattern; image set; local detector; multiple internal area; visual attention-recognition problem; Entropy; Neurons; Pixel; Retina; Testing; Training; Visualization;
Conference_Title :
Development and Learning (ICDL), 2010 IEEE 9th International Conference on
Conference_Location :
Ann Arbor, MI
Print_ISBN :
978-1-4244-6900-0
DOI :
10.1109/DEVLRN.2010.5578824