DocumentCode
52163
Title
Extended Coding and Pooling in the HMAX Model
Author
Theriault, Christian ; Thome, Nicolas ; Cord, Matthieu
Author_Institution
Univ. Pierre et Marie Curie, Paris, France
Volume
22
Issue
2
fYear
2013
fDate
Feb. 2013
Firstpage
764
Lastpage
777
Abstract
This paper presents an extension of the HMAX model, a neural network model for image classification. The HMAX model can be described as a four-level architecture, with the first level consisting of multiscale and multiorientation local filters. We introduce two main contributions to this model. First, we improve the way the local filters at the first level are integrated into more complex filters at the last level, providing a flexible description of object regions and combining local information of multiple scales and orientations. These new filters are discriminative and yet invariant, two key aspects of visual classification. We evaluate their discriminative power and their level of invariance to geometrical transformations on a synthetic image set. Second, we introduce a multiresolution spatial pooling. This pooling encodes both local and global spatial information to produce discriminative image signatures. Classification results are reported on three image data sets: Caltech101, Caltech256, and fifteen scenes. We show significant improvements over previous architectures using a similar framework.
Keywords
filtering theory; image classification; image coding; neural nets; Caltech101; Caltech256; HMAX model; discriminative image signatures; discriminative power; extended coding; four-level architecture; geometrical transformations; image data sets; multiorientation local filters; multiresolution spatial pooling; multiscale local filters; neural network model; object region flexible description; synthetic image set; visual classification; Biological system modeling; Brain modeling; Convolution; Equations; Prototypes; Training; Visualization; Convolutional network; multiscale; object recognition; spatial pooling; vision
fLanguage
English
Journal_Title
Image Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1057-7149
Type
jour
DOI
10.1109/TIP.2012.2222900
Filename
6324437