• DocumentCode
    52163
  • Title
    Extended Coding and Pooling in the HMAX Model
  • Author
    Theriault, Christian; Thome, Nicolas; Cord, Matthieu
  • Author_Institution
    Univ. Pierre et Marie Curie, Paris, France
  • Volume
    22
  • Issue
    2
  • fYear
    2013
  • fDate
    Feb. 2013
  • Firstpage
    764
  • Lastpage
    777
  • Abstract
    This paper presents an extension of the HMAX model, a neural network model for image classification. The HMAX model can be described as a four-level architecture, with the first level consisting of multiscale and multiorientation local filters. We introduce two main contributions to this model. First, we improve the way the local filters at the first level are integrated into more complex filters at the last level, providing a flexible description of object regions and combining local information of multiple scales and orientations. These new filters are discriminative and yet invariant, two key aspects of visual classification. We evaluate their discriminative power and their level of invariance to geometrical transformations on a synthetic image set. Second, we introduce a multiresolution spatial pooling. This pooling encodes both local and global spatial information to produce discriminative image signatures. Classification results are reported on three image data sets: Caltech101, Caltech256, and Fifteen Scenes. We show significant improvements over previous architectures using a similar framework.
  • Keywords
    filtering theory; image classification; image coding; neural nets; Caltech101; Caltech256; HMAX model; discriminative image signatures; discriminative power; extended coding; four-level architecture; geometrical transformations; image data sets; multiorientation local filters; multiresolution spatial pooling; multiscale local filters; neural network model; object region flexible description; synthetic image set; visual classification; Biological system modeling; Brain modeling; Convolution; Equations; Prototypes; Training; Visualization; Convolutional network; multiscale; object recognition; spatial pooling; vision
  • fLanguage
    English
  • Journal_Title
    Image Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1057-7149
  • Type
    jour
  • DOI
    10.1109/TIP.2012.2222900
  • Filename
    6324437