• DocumentCode
    688178
  • Title
    Analysis and Improvement of Makespan and Utilization for MapReduce
  • Author
    Yin Li ; Chuang Lin ; Fengyuan Ren
  • Author_Institution
    Tsinghua Nat. Lab. for Inf. Sci. & Technol. (TNList), Tsinghua Univ., Beijing, China
  • fYear
    2013
  • fDate
    13-15 Nov. 2013
  • Firstpage
    434
  • Lastpage
    441
  • Abstract
    A MapReduce cluster is usually shared by multiple users or products, each aiming to accelerate its own jobs. In contrast, the utilization of the cluster is the main concern for the system itself. MapReduce jobs are split into independent tasks during execution. However, in practice, the number of tasks per stage for each job and the system settings are often sub-optimal for a given workload. As far as we know, little research has focused on the impact of influential factors such as task granularity, slot count, and workload, or on performance analysis from both the per-job and system perspectives. We construct models to describe the effects of these factors on performance from both the per-job and system perspectives. Based on the understanding provided by the analytical models, we discuss the optimal settings of job and system parameters, then propose a batch scheduling policy to replace FIFO, so as to improve system utilization and reduce the average job processing time. The simulation results show that the average gap time is reduced by 10% and the mean job makespan is improved by 15% when each job is preempted 6 times on average using the batch scheduling policy.
  • Keywords
    data handling; pattern clustering; scheduling; FIFO; MapReduce cluster; average job processing time reduction; batch scheduling policy; cluster utilization; makespan analysis; makespan improvement; performance analysis; slot count; system utilization; task granularity; workload; Aggregates; Analytical models; Biological system modeling; Conferences; Mathematical model; Performance analysis; Simulation; MapReduce; makespan; performance; utilization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC), 2013 IEEE 10th International Conference on
  • Conference_Location
    Zhangjiajie
  • Type
    conf
  • DOI
    10.1109/HPCC.and.EUC.2013.69
  • Filename
    6831951