• DocumentCode
    3138316
  • Title

    DASCA: Data Aware Scaling Down to provide power proportionality for distributed data processing frameworks

  • Author

    Kim, Hyeong S. ; Shin, Dong In ; Yu, Young Jin ; Eom, Hyeonsang ; Yeom, Heon Y.

  • Author_Institution
    Sch. of Comput. Sci. & Eng., Seoul Nat. Univ., Seoul, South Korea
  • fYear
    2011
  • fDate
    25-28 July 2011
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    Distributed systems have led to the adoption of cloud computing concepts among countless enterprises. A large number of companies have already benefited from delegating IT services to cloud service providers. At the same time, interest in energy efficiency has increased dramatically, and energy efficiency in large distributed systems is a major concern for system engineers. In addition, the proliferation of distributed data processing frameworks such as MapReduce has led to a vast amount of research and practice. In this paper, we are particularly interested in providing energy proportionality for MapReduce. To that end, we propose Data Aware Scaling Down (DASCA), a scaling-down framework for MapReduce. There are two problems we must address in order to support scaling down for MapReduce. The first is to choose a proper set of nodes to suspend, which we call the candidate set. The second is to minimize the replica redistribution that occurs during the initiation of power save mode. To address these problems, we use the data awareness of the MapReduce framework. For the first problem, we provide two greedy algorithms that exploit this data awareness. For the second, we propose locality-aware replica redistribution, which efficiently redistributes the lost replicas while preserving replica availability and the performance of distributed processing.
  • Keywords
    cloud computing; distributed processing; information technology; power aware computing; DASCA; IT services; MapReduce framework; cloud service providers; data aware scaling down; distributed data processing frameworks; distributed systems; energy efficiency; energy proportionality; power proportionality; power save mode; replica redistribution; Analytical models; Availability; Computational modeling; Data processing; Distributed databases; Mathematical model; Servers; data awareness; distributed data processing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Green Computing Conference and Workshops (IGCC), 2011 International
  • Conference_Location
    Orlando, FL
  • Print_ISBN
    978-1-4577-1222-7
  • Type

    conf

  • DOI
    10.1109/IGCC.2011.6008551
  • Filename
    6008551