Title :
Reconfigurable models for scene recognition
Author :
Parizi, Sobhan Naderi ; Oberlin, John G. ; Felzenszwalb, Pedro F.
Abstract :
We propose a new latent variable model for scene recognition. Our approach represents a scene as a collection of region models (“parts”) arranged in a reconfigurable pattern. We partition an image into a predefined set of regions and use a latent variable to specify which region model is assigned to each image region. In our current implementation we use a bag of words representation to capture the appearance of an image region. The resulting method generalizes a spatial bag of words approach that relies on a fixed model for the bag of words in each image region. Our models can be trained using both generative and discriminative methods. In the generative setting we use the Expectation-Maximization (EM) algorithm to estimate model parameters from a collection of images with category labels. In the discriminative setting we use a latent structural SVM (LSSVM). We note that LSSVMs can be very sensitive to initialization and demonstrate that generative training with EM provides a good initialization for discriminative training with LSSVM.
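The latent assignment of region models to image regions described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes linear scoring of bag-of-words histograms and an independent choice of region model per region, and it omits any further terms the full model may contain. The names `score_reconfigurable`, `score_fixed`, `hists`, and `part_weights` are hypothetical.

```python
import numpy as np


def score_reconfigurable(hists, part_weights):
    """Score an image with a latent choice of region model per region.

    hists:        (R, V) array, one bag-of-words histogram per image region.
    part_weights: (K, V) array, one linear region model ("part") per row.
    Assumption: each region picks its best-scoring part independently.
    """
    scores = hists @ part_weights.T          # (R, K) score of every part in every region
    assignment = scores.argmax(axis=1)       # latent variable: chosen part per region
    return float(scores.max(axis=1).sum()), assignment


def score_fixed(hists, part_weights):
    """Spatial bag-of-words baseline: region r always uses model r (requires K == R)."""
    return float(np.sum(hists * part_weights))


# Toy example: 2 regions, vocabulary of 3 visual words.
hists = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
part_weights = np.array([[0.0, 1.0, 0.0],    # part 0 responds to word 1
                         [1.0, 0.0, 0.0]])   # part 1 responds to word 0

total, assignment = score_reconfigurable(hists, part_weights)
fixed_total = score_fixed(hists, part_weights)
```

In this toy case the fixed layout scores 0 because each region is paired with the wrong model, while the reconfigurable score is 2 with the parts swapped. Under these assumptions the reconfigurable score is never lower than the fixed one, since the fixed assignment is one of the configurations the maximization considers.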
Keywords :
expectation-maximisation algorithm; image representation; object recognition; parameter estimation; support vector machines; EM algorithm; LSSVM; bag-of-words representation; discriminative methods; expectation-maximization algorithm; generative methods; latent structural SVM; latent variable model; model parameter estimation; reconfigurable models; region model; scene recognition; support vector machine; Computational modeling; Mathematical model; Support vector machines; Training; Vectors; Visualization; Zirconium;
Conference_Title :
Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on
Conference_Location :
Providence, RI
Print_ISBN :
978-1-4673-1226-4
ISSN :
1063-6919
DOI :
10.1109/CVPR.2012.6248001