DocumentCode :
2054188
Title :
Improving MapReduce Performance via Heterogeneity-Load-Aware Partition Function
Author :
Sun, Huifeng ; Chen, Junliang ; Liu, Chuanchang ; Zheng, Zibin ; Yu, Nan ; Yang, Zhi
Author_Institution :
State Key Lab. of Networking & Switching Technol., Beijing Univ. of Posts & Telecommun., Beijing, China
fYear :
2011
fDate :
26-30 Sept. 2011
Firstpage :
557
Lastpage :
560
Abstract :
MapReduce is an important programming model for large-scale data-intensive applications such as web indexing, scientific simulation, and data mining. Hadoop is a widely adopted open-source implementation of MapReduce. The partition function is an important component of Hadoop: it splits the outputs of map tasks into partitions that serve as the input data of reduce tasks. The default partition function splits intermediate keys among reduce tasks on the assumption that cluster nodes are homogeneous and perform work at roughly the same rate. In practice, however, the homogeneity assumption seldom holds, and cluster nodes usually perform work at different rates. In this paper, we design a heterogeneity-load-aware partition function named the proportional partition function (PPF). Besides the dynamic load of cluster nodes, PPF considers the capacity diversity of cluster nodes, such as CPU processing speed and disk writing speed.
Keywords :
distributed processing; public domain software; CPU processing speed; Hadoop; MapReduce performance improvement; Web indexing; data mining; disk writing speed; dynamic loading; heterogeneity-load-aware partition function; large-scale data-intensive application; open-source implementation; programming model; proportional partition function; scientific simulation; Computers; Data mining; Degradation; Dynamic scheduling; Educational institutions; Loading; Hadoop; MapReduce; heterogeneity; partition function; scheduling;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cluster Computing (CLUSTER), 2011 IEEE International Conference on
Conference_Location :
Austin, TX
Print_ISBN :
978-1-4577-1355-2
Electronic_ISBN :
978-0-7695-4516-5
Type :
conf
DOI :
10.1109/CLUSTER.2011.68
Filename :
6061207