• DocumentCode
    3530702
  • Title

    Searching by Similarity and Classifying Images on a Very Large Scale

  • Author

    Amato, Giuseppe ; Savino, Pasquale

  • Author_Institution
    Ist. di Scienza e Tecnol. dell'Inf., ISTI-CNR Pisa, Pisa, Italy
  • fYear
    2009
  • fDate
    29-30 Aug. 2009
  • Firstpage
    149
  • Lastpage
    150
  • Abstract
    In this demonstration we show a system for searching by similarity and automatically classifying images in a very large dataset. The demonstrated techniques use the MI-File (Metric Inverted File) as the access method for executing similarity search efficiently. The MI-File is an access method based on inverted files that relies on a space transformation, which uses the notion of perspective to decide on the similarity between two objects. More specifically, if two objects are close to each other, the views of the space from their positions are also similar. Leveraging this space transformation, it is possible to use inverted files to execute approximate similarity search. To test the scalability of this access method, we inserted 106 million images from the CoPhIR dataset and created an online search engine that allows anyone to search this dataset. In addition, we used this access method to perform automatic classification on this very large image dataset. More specifically, we reformulated the classification problem, as it results from the use of an SVM with an RBF kernel, as a complex approximate similarity search problem. In this way, instead of comparing every single image against the classifier, the best images belonging to a class are obtained directly as the result of a complex approximate similarity search query.
  • Keywords
    content-based retrieval; image classification; image retrieval; search engines; very large databases; content based image retrieval; metric inverted file; search engine; similarity search; very large dataset; Content based retrieval; Kernel; Large-scale systems; Search problems; Support vector machine classification; Support vector machines; Testing; image; image content based retrieval;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Similarity Search and Applications, 2009. SISAP '09. Second International Workshop on
  • Conference_Location
    Prague
  • Print_ISBN
    978-0-7695-3765-8
  • Type

    conf

  • DOI
    10.1109/SISAP.2009.10
  • Filename
    5271938