  • DocumentCode
    3016301
  • Title
    Learning Visual Representations using Images with Captions
  • Author
    Quattoni, Ariadna ; Collins, Michael ; Darrell, Trevor
  • Author_Institution
    MIT, Cambridge
  • fYear
    2007
  • fDate
    17-22 June 2007
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    Current methods for learning visual categories work well when a large amount of labeled data is available, but can run into severe difficulties when the number of labeled examples is small. When labeled data is scarce it may be beneficial to use unlabeled data to learn an image representation that is low-dimensional, but nevertheless captures the information required to discriminate between image categories. This paper describes a method for learning representations from large quantities of unlabeled images which have associated captions; the goal is to improve learning in future image classification problems. Experiments show that our method significantly outperforms (1) a fully-supervised baseline model, (2) a model that ignores the captions and learns a visual representation by performing PCA on the unlabeled images alone and (3) a model that uses the output of word classifiers trained using captions and unlabeled data. Our current work concentrates on captions as the source of meta-data, but more generally other types of meta-data could be used.
  • Keywords
    image classification; image representation; learning (artificial intelligence); image caption; image category; image classification; image representation learning; labeled data; meta-data; unlabeled image; visual category learning; visual representation learning; Artificial intelligence; Computer science; Image classification; Image representation; Laboratories; Learning; Natural languages; Principal component analysis; Training data; Vectors;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision and Pattern Recognition, 2007. CVPR '07. IEEE Conference on
  • Conference_Location
    Minneapolis, MN
  • ISSN
    1063-6919
  • Print_ISBN
    1-4244-1179-3
  • Electronic_ISBN
    1063-6919
  • Type
    conf
  • DOI
    10.1109/CVPR.2007.383173
  • Filename
    4270198