• DocumentCode
    604403
  • Title
    K%-Fair scheduling: A flexible task scheduling strategy for balancing fairness and efficiency in MapReduce systems
  • Author
    Hui Zhao ; Shuqiang Yang ; Zhikun Chen ; Hua Fan ; Jinghu Xu
  • Author_Institution
    Sch. of Comput., Nat. Univ. of Defense Technol., Changsha, China
  • fYear
    2012
  • fDate
    29-31 Dec. 2012
  • Firstpage
    629
  • Lastpage
    633
  • Abstract
    MapReduce is an important programming paradigm for big-data-intensive computing on shared-nothing clusters containing tens of thousands of nodes, in which computing nodes also act as storage nodes. Since tasks belonging to different jobs are the physical executing entities scattered across the whole cluster, task scheduling plays a crucial role in MapReduce systems. For data consolidation and utilization, a MapReduce cluster is usually used as a shared computing environment rather than as several private clusters. Typical workloads consist of concurrent jobs, including interactive jobs and batch jobs, so fairness is an important target in this scenario. On the other hand, efficiency is also a vital concern for the cluster owner, and data locality is used as a heuristic to achieve high efficiency. Achieving both goals is a major challenge that requires extensive research. State-of-the-art schedulers do not solve this problem well. In this paper, we propose K%-Fair scheduling, a flexible task scheduling strategy based on multiple node-level task queues that takes both fairness and data locality into account. Finally, we evaluate our scheduling strategy on data locality and fairness among jobs: it improves data locality considerably while keeping fairness nearly the same.
  • Keywords
    parallel programming; processor scheduling; K%-Fair scheduling; MapReduce cluster; MapReduce programming paradigm; MapReduce system efficiency balancing; MapReduce system fairness balancing; batch jobs; computing nodes; concurrent jobs; data consolidation; data locality; data locality improvement; data utilization; data-intensive computing; flexible task scheduling strategy; heuristics; interactive jobs; multiple task queues; node level; physical executing entities; share-nothing cluster; shared computing environment; storage nodes; Algorithms; Big data; Fairness; Hadoop; Locality; MapReduce; Scheduling;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Network Technology (ICCSNT), 2012 2nd International Conference on
  • Conference_Location
    Changchun
  • Print_ISBN
    978-1-4673-2963-7
  • Type
    conf
  • DOI
    10.1109/ICCSNT.2012.6526015
  • Filename
    6526015