• DocumentCode
    3717469
  • Title

    Indexing media storms on Flink

  • Author

    Dimitrios Rafailidis;Stefanos Antaris

  • Author_Institution
    Department of Informatics, Aristotle University of Thessaloniki
  • fYear
    2015
  • Firstpage
    2836
  • Lastpage
    2838
  • Abstract
    We propose a media storm indexing algorithm using Map-Reduce in our recently proposed CDVC framework. In this study, CDVC is built on Flink, an open-source platform for stream data processing. The question we answer is how to store massive image collections; for instance, with over one million images per second, as well as with a varying incoming rate. In our experiments with two benchmark datasets of 80M and 1B image descriptors, we evaluate the proposed algorithm on different indexing workloads, that is, images that come with high volume and different velocity at the scale of 10^5 to 10^6 images per second. Using a limited set of computational nodes, we show that we achieve a significant speed-up factor of nine, on average, compared to conventional indexing techniques, in all settings. Finally, we make our source code publicly available.
  • Keywords
    "Indexing","Media","Storms","Gaussian distribution","Standards","Big data"
  • Publisher
    ieee
  • Conference_Titel
    Big Data (Big Data), 2015 IEEE International Conference on
  • Type

    conf

  • DOI
    10.1109/BigData.2015.7364094
  • Filename
    7364094