• DocumentCode
    3006642
  • Title

    A Throughput Driven Task Scheduler for Improving MapReduce Performance in Job-Intensive Environments

  • Author

    Xite Wang ; Derong Shen ; Ge Yu ; Tiezheng Nie ; Yue Kou

  • Author_Institution
    Coll. of Inf. Sci. & Eng., Northeastern Univ., Shenyang, China
  • fYear
    2013
  • fDate
    June 27 2013-July 2 2013
  • Firstpage
    211
  • Lastpage
    218
  • Abstract
    MapReduce has proven to be a highly desirable platform for scalable parallel data analysis. Task scheduling in MapReduce is crucial to job execution and has a marked impact on system performance. To the best of our knowledge, previous scheduling algorithms rarely consider job-intensive environments and are unable to provide high system throughput. Hence, this paper proposes a novel technique for job-intensive scheduling to improve system throughput. First, through an in-depth analysis of job-intensive environments, we identify four major factors that affect system throughput. Second, based on these factors, we propose an efficient technique, called the throughput driven task scheduler, which adopts a series of measures to improve the throughput of a MapReduce cluster system. Finally, extensive simulation experiments are conducted, and the results show that the scheduler provides higher throughput than previous systems and meets the requirements of practical job-intensive applications.
  • Keywords
    data analysis; parallel processing; pattern clustering; scheduling; MapReduce cluster system; MapReduce performance; job execution; job intensive environments; scalable parallel data analysis; throughput driven task scheduler; Data analysis; Data communication; Processor scheduling; Scheduling; System performance; Throughput; Upper bound; MapReduce; scheduling; throughput
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Big Data (BigData Congress), 2013 IEEE International Congress on
  • Conference_Location
    Santa Clara, CA
  • Print_ISBN
    978-0-7695-5006-0
  • Type

    conf

  • DOI
    10.1109/BigData.Congress.2013.36
  • Filename
    6597139