• DocumentCode
    2914808
  • Title
    Spatial-DiscLDA for visual recognition
  • Author
    Niu, Zhenxing ; Hua, Gang ; Gao, Xinbo ; Tian, Qi
  • fYear
    2011
  • fDate
    20-25 June 2011
  • Firstpage
    1769
  • Lastpage
    1776
  • Abstract
    Topic models such as pLSA, LDA, and their variants have been widely adopted for visual recognition. However, most of the adopted models, if not all, are unsupervised and therefore neglect the valuable supervised labels during model training. In this paper, we exploit recent advances in supervised topic modeling, in particular the DiscLDA model, for object recognition. We extend it to a part-based visual representation to automatically identify and model different object parts. We call the proposed model Spatial-DiscLDA (S-DiscLDA). It models the appearances and locations of the object parts simultaneously while also taking the supervised labels into consideration. It can be used directly as a classifier to recognize objects; classification is performed by an approximate inference algorithm based on Gibbs sampling and bridge sampling. We evaluate our model by comparing it with another supervised topic model on two scene category datasets, LabelMe and UIUC-sport. We also compare our approach with other approaches that model the spatial structure of visual features on the popular Caltech-4 dataset. The experimental results show that it provides competitive performance.
  • Keywords
    image sampling; inference mechanisms; natural language processing; object recognition; probability; Caltech-4 dataset; Gibbs sampling method; LabelMe dataset; UIUC-sport dataset; approximate inference algorithm; bridge sampling methods; latent Dirichlet allocation; object recognition; pLSA; probabilistic latent semantic analysis; scene category datasets; spatial-DiscLDA model; supervised topic modeling; valuable supervised labels; visual recognition; visual representation; Computational modeling; Inference algorithms; Mathematical model; Neodymium; Object recognition; Training; Visualization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on
  • Conference_Location
    Providence, RI
  • ISSN
    1063-6919
  • Print_ISBN
    978-1-4577-0394-2
  • Type
    conf
  • DOI
    10.1109/CVPR.2011.5995426
  • Filename
    5995426