• DocumentCode
    3739490
  • Title

    SmartFetch: Efficient Support for Selective Queries

  • Author

    Manuel Ferreira;João ;Manuel Bravo;Luís

  • fYear
    2015
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    The paper proposes SmartFetch, a storage strategy that relies on a combination of techniques aimed at efficiently supporting selective jobs that are only concerned with a subset of the entire dataset in systems such as Hadoop and Spark. We combine the use of an appropriate data layout with data indexing tools to improve the data access speed and significantly shorten total job execution time. An extensive experimental evaluation of SmartFetch shows that, by avoiding reading irrelevant blocks, it can provide significant speedups when compared to the basic Hadoop and Spark implementations. Further, our system also outperforms other implementations that use several variants of the techniques we have embedded in SmartFetch.
  • Keywords
    "Sparks","Indexing","Big data","Twitter","Layout","Computers"
  • Publisher
    ieee
  • Conference_Titel
    Cloud Computing Technology and Science (CloudCom), 2015 IEEE 7th International Conference on
  • Type

    conf

  • DOI
    10.1109/CloudCom.2015.83
  • Filename
    7396131