• DocumentCode
    3691823
  • Title
    Evaluating cluster configurations for big data processing: an exploratory study
  • Author
    Roni Sandel;Mark Shtern;Marios Fokaefs;Marin Litoiu
  • Author_Institution
    York University
  • fYear
    2015
  • fDate
    10/2/2015 12:00:00 AM
  • Firstpage
    23
  • Lastpage
    30
  • Abstract
    As data continues to grow rapidly, NoSQL clusters have been increasingly adopted to address the storage and processing demands of these large amounts of data. In parallel, cloud computing is also increasingly being adopted due to its flexibility, cost efficiency and scalability. However, evaluating and modelling NoSQL clusters presents many challenges. In this work, we explore these challenges by performing a series of experiments with various configurations. Our intuition is that this process is laborious and expensive; the goal of our experiments is to confirm this intuition and to identify the factors that impact the performance of a Big Data cluster. Our experiments focus mostly on three factors: data compression, data schema and cluster topology. We performed a number of experiments based on these factors, and measured and compared the response times of the resulting configurations. Finally, the outcomes of our study are encapsulated in a performance model that predicts the cluster's response time as a function of the incoming workload, allowing the cluster's performance to be evaluated faster and at lower cost. This systematic and effortless evaluation method will facilitate the selection of, and migration to, a better cluster as performance and budget goals change. We use HBase as the large data processing cluster, and we conduct our experiments on traffic data from a large city and on a distributed community cloud infrastructure.
  • Keywords
    "Topology","Time factors","Cloud computing","Big data","Data compression","Virtual machining","Predictive models"
  • Publisher
    IEEE
  • Conference_Titel
    Maintenance and Evolution of Service-Oriented and Cloud-Based Environments (MESOCA), 2015 IEEE 9th International Symposium on the
  • Electronic_ISBN
    2326-6937
  • Type
    conf
  • DOI
    10.1109/MESOCA.2015.7328122
  • Filename
    7328122