DocumentCode
2054188
Title
Improving MapReduce Performance via Heterogeneity-Load-Aware Partition Function
Author
Sun, Huifeng ; Chen, Junliang ; Liu, Chuanchang ; Zheng, Zibin ; Yu, Nan ; Yang, Zhi
Author_Institution
State Key Lab. of Networking & Switching Technol., Beijing Univ. of Posts & Telecommun., Beijing, China
fYear
2011
fDate
26-30 Sept. 2011
Firstpage
557
Lastpage
560
Abstract
MapReduce is an important programming model for large-scale data-intensive applications such as web indexing, scientific simulation, and data mining. Hadoop is a widely adopted open-source implementation of MapReduce. The partition function is an important component of Hadoop: it splits the outputs of map tasks into partitions that form the input data of reduce tasks. Hadoop's default partition function distributes intermediate keys across reduces on the assumption that cluster nodes are homogeneous and perform work at roughly the same rate. In practice, however, this homogeneity assumption seldom holds, and cluster nodes usually perform work at different rates. In this paper, we design a heterogeneity-load-aware partition function named the proportional partition function (PPF). In addition to the dynamic load of cluster nodes, PPF considers the capacity diversity of cluster nodes, such as CPU processing speed and disk writing speed.
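The contrast the abstract draws, between a uniform split of intermediate keys and a split proportional to node capability, can be illustrated with a minimal Python sketch. This is not the paper's PPF algorithm, which the abstract does not specify. The function names, the use of an MD5-based hash, and the `weights` parameter (one non-negative number per reduce, standing in for some combination of CPU speed, disk writing speed, and current load) are all assumptions made for illustration.

```python
import bisect
import hashlib
from itertools import accumulate
from typing import Sequence


def _stable_hash53(key: str) -> int:
    """Stable 53-bit hash of a key (illustrative choice, not Hadoop's hashCode)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") >> 11


def uniform_partition(key: str, num_reduces: int) -> int:
    """Baseline: every reduce receives roughly the same share of keys."""
    return _stable_hash53(key) % num_reduces


def proportional_partition(key: str, weights: Sequence[float]) -> int:
    """Assign a key to a reduce with probability proportional to its weight.

    `weights` is a hypothetical per-reduce capability score; how such a score
    would be derived from CPU speed, disk speed, and load is left open here.
    """
    if not weights or any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    bounds = list(accumulate(weights))            # cumulative weight boundaries
    fraction = _stable_hash53(key) / 2 ** 53      # value in [0, 1)
    point = fraction * bounds[-1]                 # position on the weight line
    index = bisect.bisect_right(bounds, point)    # first boundary above point
    return min(index, len(weights) - 1)           # guard against float rounding
```

With weights such as `[3.0, 1.0]`, roughly three quarters of the keys map to reduce 0, whereas `uniform_partition` would give each reduce about half. Hashing the key rather than drawing a random number keeps the assignment deterministic, so all records with the same key still reach the same reduce.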
Keywords
distributed processing; public domain software; CPU processing speed; Hadoop; MapReduce performance improvement; Web indexing; data mining; disk writing speed; dynamic loading; heterogeneity-load-aware partition function; large-scale data-intensive application; open-source implementation; programming model; proportional partition function; scientific simulation; Computers; Data mining; Degradation; Dynamic scheduling; Educational institutions; Loading; Hadoop; MapReduce; heterogeneity; partition function; scheduling
fLanguage
English
Publisher
IEEE
Conference_Titel
Cluster Computing (CLUSTER), 2011 IEEE International Conference on
Conference_Location
Austin, TX
Print_ISBN
978-1-4577-1355-2
Electronic_ISBN
978-0-7695-4516-5
Type
conf
DOI
10.1109/CLUSTER.2011.68
Filename
6061207