• DocumentCode
    2136018
  • Title
    Optimal Tradeoff between Energy Consumption and Response Time in Large-Scale MapReduce Clusters
  • Author
    Paraskevopoulos, Pavlos ; Gounaris, Anastasios
  • Author_Institution
    Dept. of Inf., Aristotle Univ. of Thessaloniki, Thessaloniki, Greece
  • fYear
    2011
  • fDate
    Sept. 30 2011-Oct. 2 2011
  • Firstpage
    144
  • Lastpage
    148
  • Abstract
    The growing size of digital databases has created a need for infrastructures, such as large-scale data centers and computational clusters, that are capable of storing and processing very large volumes of data. To date, most clusters have been designed for performance. Because typical applications commonly exhibit non-linear speed-ups, maximizing performance involves deciding how many nodes should process a specific (intensive) task, as opposed to utilizing the full cluster. In addition, energy consumption has recently attracted significant attention, given that the cost of operating a cluster may well exceed its acquisition cost. This issue also calls for judicious use of resources. The aim of this study is to present a method that achieves the optimal tradeoff between energy consumption and response time in distributed clusters, such as MapReduce clusters. To this end, we propose an algorithm that derives the fraction of the nodes that minimizes energy consumption without degrading performance (in terms of response time) by more than a user-defined threshold. Moreover, we present a generic and configurable framework that describes performance and energy consumption as a function of the number of nodes used; the framework accommodates the widespread MapReduce-like parallel executions in a straightforward manner. The evaluation results show that our methodology can lead to significant energy savings with an acceptable performance penalty in many realistic situations.
  • Keywords
    distributed databases; computational clusters; digital databases; energy consumption; large scale data centers; large-scale MapReduce clusters; optimal tradeoff; performance maximization; response time; Clustering algorithms; Degradation; Energy consumption; Informatics; Parallel processing; Servers; Time factors; energy consumption; map-reduce; response time;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Informatics (PCI), 2011 15th Panhellenic Conference on
  • Conference_Location
    Kastoria
  • Print_ISBN
    978-1-61284-962-1
  • Type
    conf
  • DOI
    10.1109/PCI.2011.30
  • Filename
    6065041