• DocumentCode
    3406906
  • Title
    Online visual vocabulary pruning using pairwise constraints
  • Author
    Mallapragada, Pavan K. ; Jin, Rong ; Jain, Anil K.
  • Author_Institution
    Dept. Comput. Sci. & Eng., Michigan St. Univ., East Lansing, MI, USA
  • fYear
    2010
  • fDate
    13-18 June 2010
  • Firstpage
    3073
  • Lastpage
    3080
  • Abstract
    Given a pair of images represented using bag-of-visual-words and a label indicating whether the images are “related” (must-link constraint) or “unrelated” (cannot-link constraint), we address the problem of selecting a subset of visual words that are salient in explaining the relation between the image pair. In particular, a subset of features is selected such that the distance computed using these features satisfies the given pairwise constraints. An efficient online feature selection algorithm is presented based on the dual-gradient descent approach. Side information in the form of pairwise constraints is incorporated into the feature selection stage, providing the user with the flexibility to use an unsupervised or semi-supervised algorithm at a later stage. Correlated subsets of visual words (called groups), which usually result from a hierarchical quantization process, are exploited to select a significantly smaller vocabulary. A group-LASSO regularizer is used to drive as many feature weights to zero as possible. We evaluate the quality of the pruned vocabulary by clustering the data using the resulting feature subset. Experiments on the PASCAL VOC 2007 dataset using 5000 visual keywords resulted in a reduction of around 80% in the number of keywords, with little or no loss in performance.
  • Keywords
    gradient methods; image representation; pattern clustering; quantisation (signal); vocabulary; PASCAL VOC 2007 dataset; bag-of-visual-words; cannot-link constraint; data clustering; dual-gradient descent approach; group-LASSO regularizer; hierarchical quantization process; must-link constraint; online feature selection algorithm; online visual vocabulary pruning; pairwise constraint; pruned vocabulary; semi-supervised algorithm; unsupervised algorithm; visual words; Clustering algorithms; Computer science; Image retrieval; Large-scale systems; Partitioning algorithms; Performance loss; Prototypes; Quantization; Scalability; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on
  • Conference_Location
    San Francisco, CA
  • ISSN
    1063-6919
  • Print_ISBN
    978-1-4244-6984-0
  • Type
    conf
  • DOI
    10.1109/CVPR.2010.5540062
  • Filename
    5540062
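
The abstract describes learning per-word weights from must-link and cannot-link pairs, with a group-LASSO regularizer that zeroes out whole groups of correlated visual words. The following is a minimal sketch of that idea only, not the authors' algorithm: it swaps the paper's dual-gradient descent for a plain subgradient step on a hinge loss followed by group soft-thresholding. The function names, the margin threshold `b`, the step size `eta`, and the penalty `tau` are all hypothetical choices made for illustration.

```python
import numpy as np

def weighted_sq_distance(w, x, y):
    """Squared distance between two bag-of-words histograms under word weights w."""
    return float(np.dot(w, (np.asarray(x) - np.asarray(y)) ** 2))

def group_soft_threshold(w, groups, tau):
    """Proximal step for the penalty tau * sum_g ||w_g||_2 (group LASSO).

    Each group's weights are shrunk toward zero together; a group whose
    norm is at most tau is set to exactly zero, i.e. pruned.
    """
    out = w.copy()
    for idx in groups:
        norm = np.linalg.norm(w[idx])
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        out[idx] = scale * w[idx]
    return out

def online_update(w, x, y, label, groups, b=2.0, eta=0.1, tau=0.05):
    """One update from a single pairwise constraint (illustrative assumption).

    label = +1: must-link, the weighted distance should be at most b - 1.
    label = -1: cannot-link, the weighted distance should be at least b + 1.
    """
    d = (np.asarray(x, dtype=float) - np.asarray(y, dtype=float)) ** 2
    loss = max(0.0, 1.0 + label * (np.dot(w, d) - b))
    if loss > 0.0:
        # Subgradient of the hinge loss with respect to w is label * d.
        w = w - eta * label * d
    w = np.maximum(w, 0.0)  # keep weights non-negative
    return group_soft_threshold(w, groups, tau)

# Toy vocabulary of 4 words in 2 groups (e.g. siblings in a quantization tree).
groups = [np.array([0, 1]), np.array([2, 3])]
w = np.array([3.0, 4.0, 0.03, 0.04])
pruned = group_soft_threshold(w, groups, tau=0.5)
kept_words = np.flatnonzero(pruned)
```

In this toy case the second group has a norm below `tau`, so both of its words receive zero weight and drop out of the vocabulary, while the first group is only shrunk. Words with non-zero weight after many such updates would form the pruned vocabulary passed on to a clustering step.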