• DocumentCode
    629090
  • Title

    Scalable high-dimensional indexing with Hadoop

  • Author

    Shestakov, Denis ; Moise, Diana ; Gudmundsson, Gylfi ; Amsaleg, Laurent

  • Author_Institution
    INRIA Rennes, Rennes, France
  • fYear
    2013
  • fDate
    17-19 June 2013
  • Firstpage
    207
  • Lastpage
    212
  • Abstract
    While high-dimensional search-by-similarity techniques have reached maturity and overall provide good performance, most of them are unable to cope with very large multimedia collections. The 'big data' challenge, however, has to be addressed, as multimedia collections have been growing explosively and will grow even faster within the next few years. Luckily, computational processing power has become more available to researchers due to easier access to distributed grid infrastructures. In this paper, we show how high-dimensional indexing methods can be used on scientific grid environments and present a scalable workflow for indexing and searching over 30 billion SIFT descriptors using a cluster running Hadoop. Our findings could help other researchers and practitioners to cope with huge multimedia collections.
  • Keywords
    grid computing; indexing; multimedia systems; natural sciences computing; Hadoop; SIFT descriptors; big data challenge; computational processing power; distributed grid infrastructures; high-dimensional search-by-similarity techniques; multimedia collections; scalable high-dimensional indexing; scalable workflow; scientific grid environments; Conferences; Indexing; Multimedia communication; Random access memory; Streaming media; Tuning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Content-Based Multimedia Indexing (CBMI), 2013 11th International Workshop on
  • Conference_Location
    Veszprem
  • ISSN
    1949-3983
  • Print_ISBN
    978-1-4799-0955-1
  • Type

    conf

  • DOI
    10.1109/CBMI.2013.6576584
  • Filename
    6576584