DocumentCode
1914087
Title
A Heterogeneity-aware Data Distribution and Rebalance Method in Hadoop Cluster
Author
Fan, Yuanquan ; Wu, Weiguo ; Cao, Haijun ; Zhu, Huo ; Zhao, Xu ; Wei, Wei
Author_Institution
Dept. of Comput. Sci. & Technol., Xi'an Jiaotong Univ., Xi'an, China
fYear
2012
fDate
20-23 Sept. 2012
Firstpage
176
Lastpage
181
Abstract
The current Hadoop implementation assumes that the computing nodes in a cluster are homogeneous. Because input data are split into blocks of a predefined size, Hadoop suffers performance degradation during the Map phase in a heterogeneous cluster. To solve this problem, we propose a heterogeneity-aware data distribution and rebalance method for heterogeneous Hadoop clusters. The method consists of two parts: 1) performance-aware data distribution, and 2) dynamic data migration. Experimental results indicate that our method can improve Map performance in a heterogeneous cluster and that it also enhances the data locality of Map tasks.
Keywords
cloud computing; data handling; pattern clustering; cluster computing nodes; dynamic data migration; heterogeneity-aware data distribution; heterogeneous Hadoop cluster; heterogeneous cluster; map phase; map task data locality; performance-aware data distribution; rebalance method; Benchmark testing; Computational modeling; Data models; Heuristic algorithms; Linux; Nickel; Time factors; Hadoop; MapReduce; data locality; heterogeneity-aware; rebalance
fLanguage
English
Publisher
IEEE
Conference_Titel
ChinaGrid Annual Conference (ChinaGrid), 2012 Seventh
Conference_Location
Beijing
Print_ISBN
978-1-4673-2623-0
Electronic_ISBN
978-0-7695-4816-6
Type
conf
DOI
10.1109/ChinaGrid.2012.22
Filename
6337296