• DocumentCode
    443187
  • Title
    A probabilistic semantic model for image annotation and multimodal image retrieval
  • Author
    Zhang, Ruofei ; Zhang, Zhongfei ; Li, Mingjing ; Ma, Wei-Ying ; Zhang, Hong-Jiang
  • Author_Institution
    Dept. of Comput. Sci., State Univ. of New York, Binghamton, NY, USA
  • Volume
    1
  • fYear
    2005
  • fDate
    17-21 Oct. 2005
  • Firstpage
    846
  • Abstract
    This paper addresses the automatic image annotation problem and its application to multi-modal image retrieval. The contribution of our work is three-fold. (1) We propose a probabilistic semantic model in which the visual features and the textual words are connected via a hidden layer that constitutes the semantic concepts to be discovered, so as to explicitly exploit the synergy among the modalities. (2) The association of visual features and textual words is determined in a Bayesian framework such that the confidence of the association can be provided. (3) Extensive evaluation of the prototype system based on the model is reported on a large-scale, visually and semantically diverse image collection crawled from the Web. In the proposed probabilistic model, a hidden concept layer that connects the visual feature layer and the word layer is discovered by fitting a generative model to the training images and annotation words through an Expectation-Maximization (EM) based iterative learning procedure. The evaluation of the prototype system on 17,000 images and 7,736 automatically extracted annotation words from crawled Web pages for multi-modal image retrieval has indicated that the proposed semantic model and the developed Bayesian framework are superior to a state-of-the-art peer system in the literature.
  • Keywords
    Web sites; content-based retrieval; expectation-maximisation algorithm; feature extraction; image retrieval; image texture; semantic networks; Bayesian framework; automatic image annotation; crawled Web pages; expectation-maximization algorithm; image collection; iterative learning procedure; multimodal image retrieval; probabilistic semantic model; textual words; visual features; word layer; Application software; Asia; Bayesian methods; Computer science; Content based retrieval; Image databases; Image retrieval; Information retrieval; Large-scale systems; Prototypes
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on
  • ISSN
    1550-5499
  • Print_ISBN
    0-7695-2334-X
  • Type
    conf
  • DOI
    10.1109/ICCV.2005.16
  • Filename
    1541341