Title :
Modeling local and global deformations in Deep Learning: Epitomic convolution, Multiple Instance Learning, and sliding window detection
Author :
George Papandreou; Iasonas Kokkinos; Pierre-André Savalle
Author_Institution :
Google, USA
fDate :
6/1/2015 12:00:00 AM
Abstract :
Deep Convolutional Neural Networks (DCNNs) achieve invariance to domain transformations (deformations) by using multiple 'max-pooling' (MP) layers. In this work we show that alternative methods of modeling deformations can improve the accuracy and efficiency of DCNNs. First, we introduce epitomic convolution as an alternative to the common convolution-MP cascade of DCNNs, which comes with the same computational cost but favorable learning properties. Second, we introduce a Multiple Instance Learning algorithm to accommodate global translation and scaling in image classification, yielding an efficient algorithm that trains and tests a DCNN in a consistent manner. Third, we develop a DCNN sliding window detector that explicitly, but efficiently, searches over the object's position, scale, and aspect ratio. We provide competitive image classification and localization results on the ImageNet dataset and object detection results on Pascal VOC2007.
Keywords :
"Convolution","Training","Computational modeling","Accuracy","Deformable models","Data structures"
Conference_Title :
Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on
Electronic_ISBN :
1063-6919
DOI :
10.1109/CVPR.2015.7298636