Title :
Scalable high-dimensional indexing with Hadoop
Author :
Shestakov, Denis ; Moise, Diana ; Gudmundsson, Gylfi ; Amsaleg, Laurent
Author_Institution :
INRIA Rennes, Rennes, France
Abstract :
Although high-dimensional search-by-similarity techniques have reached maturity and generally perform well, most of them cannot cope with very large multimedia collections. The "big data" challenge must nevertheless be addressed, as multimedia collections have been growing explosively and will grow even faster over the next few years. Fortunately, computational processing power has become more available to researchers thanks to easier access to distributed grid infrastructures. In this paper, we show how high-dimensional indexing methods can be used in scientific grid environments and present a scalable workflow for indexing and searching over 30 billion SIFT descriptors using a cluster running Hadoop. Our findings could help other researchers and practitioners cope with huge multimedia collections.
Keywords :
grid computing; indexing; multimedia systems; natural sciences computing; Hadoop; SIFT descriptors; big data challenge; computational processing power; distributed grid infrastructures; high-dimensional search-by-similarity techniques; multimedia collections; scalable high-dimensional indexing; scalable workflow; scientific grid environments; Conferences; Indexing; Multimedia communication; Random access memory; Streaming media; Tuning
Conference_Title :
Content-Based Multimedia Indexing (CBMI), 2013 11th International Workshop on
Conference_Location :
Veszprem
Print_ISBN :
978-1-4799-0955-1
DOI :
10.1109/CBMI.2013.6576584