• DocumentCode
    3672100
  • Title

    Modeling local and global deformations in Deep Learning: Epitomic convolution, Multiple Instance Learning, and sliding window detection

  • Author

    George Papandreou;Iasonas Kokkinos;Pierre-André Savalle

  • Author_Institution
    Google, USA
  • fYear
    2015
  • fDate
    6/1/2015 12:00:00 AM
  • Firstpage
    390
  • Lastpage
    399
  • Abstract
    Deep Convolutional Neural Networks (DCNNs) achieve invariance to domain transformations (deformations) by using multiple `max-pooling´ (MP) layers. In this work we show that alternative methods of modeling deformations can improve the accuracy and efficiency of DCNNs. First, we introduce epitomic convolution as an alternative to the common convolution-MP cascade of DCNNs, that comes with the same computational cost but favorable learning properties. Second, we introduce a Multiple Instance Learning algorithm to accommodate global translation and scaling in image classification, yielding an efficient algorithm that trains and tests a DCNN in a consistent manner. Third we develop a DCNN sliding window detector that explicitly, but efficiently, searches over the object´s position, scale, and aspect ratio. We provide competitive image classification and localization results on the ImageNet dataset and object detection results on Pascal VOC2007.
  • Keywords
    "Convolution","Training","Computational modeling","Accuracy","Deformable models","Data structures"
  • Publisher
    ieee
  • Conference_Titel
    Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on
  • Electronic_ISBN
    1063-6919
  • Type

    conf

  • DOI
    10.1109/CVPR.2015.7298636
  • Filename
    7298636