DocumentCode
629090
Title
Scalable high-dimensional indexing with Hadoop
Author
Shestakov, Denis ; Moise, Diana ; Gudmundsson, Gylfi ; Amsaleg, Laurent
Author_Institution
INRIA Rennes, Rennes, France
fYear
2013
fDate
17-19 June 2013
Firstpage
207
Lastpage
212
Abstract
While high-dimensional search-by-similarity techniques have reached maturity and overall provide good performance, most of them are unable to cope with very large multimedia collections. The "big data" challenge, however, has to be addressed, as multimedia collections have been growing explosively and will grow even faster within the next few years. Luckily, computational processing power has become more available to researchers thanks to easier access to distributed grid infrastructures. In this paper, we show how high-dimensional indexing methods can be used in scientific grid environments and present a scalable workflow for indexing and searching over 30 billion SIFT descriptors using a cluster running Hadoop. Our findings could help other researchers and practitioners cope with huge multimedia collections.
Keywords
grid computing; indexing; multimedia systems; natural sciences computing; Hadoop; SIFT descriptors; big data challenge; computational processing power; distributed grid infrastructures; high-dimensional search-by-similarity techniques; multimedia collections; scalable high-dimensional indexing; scalable workflow; scientific grid environments; Conferences; Indexing; Multimedia communication; Random access memory; Streaming media; Tuning
fLanguage
English
Publisher
IEEE
Conference_Titel
Content-Based Multimedia Indexing (CBMI), 2013 11th International Workshop on
Conference_Location
Veszprem
ISSN
1949-3983
Print_ISBN
978-1-4799-0955-1
Type
conf
DOI
10.1109/CBMI.2013.6576584
Filename
6576584