Title :
DASCA: Data Aware Scaling Down to provide power proportionality for distributed data processing frameworks
Author :
Kim, Hyeong S. ; Shin, Dong In ; Yu, Young Jin ; Eom, Hyeonsang ; Yeom, Heon Y.
Author_Institution :
Sch. of Comput. Sci. & Eng., Seoul Nat. Univ., Seoul, South Korea
Abstract :
Distributed systems have led to the adoption of cloud computing concepts among countless enterprises. A large number of companies have already benefited from delegating IT services to cloud service providers. At the same time, interest in energy efficiency has increased dramatically, and energy efficiency in large distributed systems has become a major concern for system engineers. In addition, the proliferation of distributed data processing frameworks such as MapReduce has led to a vast amount of research and practice. In this paper, we are particularly interested in providing energy proportionality for MapReduce. To this end, we propose Data Aware Scaling Down (DASCA), a scaling down framework for MapReduce. Two problems must be addressed in order to support scaling down for MapReduce. The first is to choose a proper set of nodes to suspend, which we call the candidate set. The second is to minimize the replica redistribution that occurs when power save mode is initiated. We address both problems by using the data awareness of the MapReduce framework. For the first problem, we provide two greedy algorithms that exploit this data awareness. For the second problem, we propose locality aware replica redistribution, which efficiently redistributes the lost replicas while preserving the availability of replicas and the performance of distributed processing.
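The abstract does not spell out the two greedy algorithms, so the following is only a minimal sketch of what a data-aware greedy choice of the candidate set could look like, not the authors' method. It assumes that the block-to-node replica map is known, that nodes holding the fewest blocks are considered first, and that a node may be suspended only if every block it stores keeps at least one replica on a node that stays active. The function name `greedy_candidate_set` and the parameters `block_replicas`, `num_to_suspend`, and `min_live_replicas` are hypothetical.

```python
from collections import defaultdict


def greedy_candidate_set(block_replicas, nodes, num_to_suspend, min_live_replicas=1):
    """Illustrative only: pick nodes to suspend, fewest stored blocks first.

    block_replicas maps a block id to the nodes holding a replica of it.
    A node is accepted only if every block it stores would still have at
    least min_live_replicas replicas on nodes that remain active.
    """
    # live[b] = set of still-active nodes that hold a replica of block b
    live = {block: set(holders) for block, holders in block_replicas.items()}

    # Invert the map: which blocks does each node store?
    blocks_on = defaultdict(set)
    for block, holders in live.items():
        for node in holders:
            blocks_on[node].add(block)

    candidates = []
    # Greedy order (an assumption): fewest stored blocks first, ties by name
    for node in sorted(nodes, key=lambda n: (len(blocks_on[n]), n)):
        if len(candidates) == num_to_suspend:
            break
        # Safe only if each block on this node keeps enough live replicas
        if all(len(live[b]) - 1 >= min_live_replicas for b in blocks_on[node]):
            candidates.append(node)
            for b in blocks_on[node]:
                live[b].discard(node)
    return candidates


# Hypothetical four-node cluster with two replicas per block
replicas = {
    "b1": ["n1", "n2"],
    "b2": ["n2", "n3"],
    "b3": ["n1", "n3"],
    "b4": ["n3", "n4"],
}
cluster = ["n1", "n2", "n3", "n4"]
chosen = greedy_candidate_set(replicas, cluster, num_to_suspend=2)
```

In this toy cluster the sketch stops early when no further node can be suspended without making a block unavailable, so it may return fewer nodes than requested. Because every block keeps a live replica, no data has to be copied before the chosen nodes are suspended; restoring the original replication factor afterwards is the separate redistribution problem the abstract describes.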
Keywords :
cloud computing; distributed processing; information technology; power aware computing; DASCA; IT services; MapReduce framework; cloud service providers; data aware scaling down; distributed data processing frameworks; distributed systems; energy efficiency; energy proportionality; power proportionality; power save mode; replica redistribution; Analytical models; Availability; Computational modeling; Data processing; Distributed databases; Mathematical model; Servers; data awareness; distributed data processing
Conference_Title :
Green Computing Conference and Workshops (IGCC), 2011 International
Conference_Location :
Orlando, FL
Print_ISBN :
978-1-4577-1222-7
DOI :
10.1109/IGCC.2011.6008551