Title :
A Throughput Driven Task Scheduler for Improving MapReduce Performance in Job-Intensive Environments
Author :
Xite Wang ; Derong Shen ; Ge Yu ; Tiezheng Nie ; Yue Kou
Author_Institution :
Coll. of Inf. Sci. & Eng., Northeastern Univ., Shenyang, China
Date :
June 27, 2013 to July 2, 2013
Abstract :
MapReduce has proven to be a highly desirable platform for scalable parallel data analysis. Task scheduling in MapReduce is crucial to job execution and has a marked impact on system performance. To the best of our knowledge, previous scheduling algorithms rarely consider job-intensive environments and cannot provide high system throughput. This paper therefore proposes a novel scheduling technique for job-intensive environments that improves system throughput. First, through an in-depth analysis of job-intensive environments, we identify four major factors that affect system throughput. Second, based on these factors, we propose an efficient technique, called the throughput driven task scheduler, which adopts a series of effective measures to improve the throughput of a MapReduce cluster system. Finally, extensive simulation experiments show that the scheduler provides higher throughput than previous systems and meets the requirements of practical job-intensive applications.
Keywords :
data analysis; parallel processing; pattern clustering; scheduling; MapReduce cluster system; MapReduce performance; job execution; job intensive environments; scalable parallel data analysis; throughput driven task scheduler; Data analysis; Data communication; Processor scheduling; Scheduling; System performance; Throughput; Upper bound; MapReduce; scheduling; throughput
Conference_Title :
Big Data (BigData Congress), 2013 IEEE International Congress on
Conference_Location :
Santa Clara, CA
Print_ISBN :
978-0-7695-5006-0
DOI :
10.1109/BigData.Congress.2013.36