• DocumentCode
    1914087
  • Title
    A Heterogeneity-aware Data Distribution and Rebalance Method in Hadoop Cluster
  • Author
    Fan, Yuanquan ; Wu, Weiguo ; Cao, Haijun ; Zhu, Huo ; Zhao, Xu ; Wei, Wei
  • Author_Institution
    Dept. of Comput. Sci. & Technol., Xi'an Jiaotong Univ., Xi'an, China
  • fYear
    2012
  • fDate
    20-23 Sept. 2012
  • Firstpage
    176
  • Lastpage
    181
  • Abstract
    The current Hadoop implementation assumes that the computing nodes in a cluster are homogeneous. Because the input data are split into data blocks of a predefined size, Hadoop suffers performance degradation during the Map phase in a heterogeneous cluster. To solve this problem, we propose a heterogeneity-aware data distribution and rebalance method for heterogeneous Hadoop clusters. The method consists of two parts: 1) performance-aware data distribution, and 2) dynamic data migration. The experimental results indicate that our method can improve Map performance in a heterogeneous cluster. Furthermore, the data locality of Map tasks is enhanced as well.
  • Keywords
    cloud computing; data handling; pattern clustering; cluster computing nodes; dynamic data migration; heterogeneity-aware data distribution; heterogeneous Hadoop cluster; heterogeneous cluster; map phase; map task data locality; performance-aware data distribution; rebalance method; Benchmark testing; Computational modeling; Data models; Heuristic algorithms; Linux; Nickel; Time factors; Hadoop; MapReduce; data locality; heterogeneity-aware; rebalance
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    ChinaGrid Annual Conference (ChinaGrid), 2012 Seventh
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4673-2623-0
  • Electronic_ISBN
    978-0-7695-4816-6
  • Type
    conf
  • DOI
    10.1109/ChinaGrid.2012.22
  • Filename
    6337296