Title :
A comparative review of job scheduling for MapReduce
Author :
Yoo, Dongjin ; Sim, Kwang Mong
Author_Institution :
Multi-Agent & Cloud Comput. Syst. Lab., Gwangju Inst. of Sci. & Technol. (GIST), Gwangju, South Korea
Abstract :
MapReduce is an emerging paradigm for data-intensive processing, supported by cloud computing technology. MapReduce provides convenient programming interfaces for distributing data-intensive work across a cluster. Its strengths are fault tolerance, a simple programming structure, and high scalability. A variety of applications have adopted MapReduce, including scientific analysis, web data processing, and high-performance computing. Data-intensive computing systems such as Hadoop and Dryad should provide an efficient scheduling mechanism to improve utilization in a shared cluster environment. The problems in scheduling MapReduce jobs are mostly caused by locality and synchronization overhead. There is also a need to schedule multiple jobs in a shared cluster under fairness constraints. After introducing the scheduling problems with regard to locality, synchronization, and fairness constraints, this paper reviews a collection of scheduling methods for handling these issues in MapReduce. It also compares the scheduling methods, evaluating their features, strengths, and weaknesses. For resolving synchronization overhead, two categories of studies are discussed: asynchronous processing and speculative execution. For fairness constraints with locality improvement, delay scheduling in Hadoop and the Quincy scheduler in Dryad are discussed.
Keywords :
cloud computing; scheduling; Dryad; Hadoop; MapReduce; Quincy scheduler; Web data processing; asynchronous processing; cloud computing technology; data intensive computing systems; delay scheduling; distribute data intensive works; high performance computing; job scheduling; programming interfaces; scientific analysis; speculative execution; Bandwidth; Data processing; Delay; Optimization; Processor scheduling; Scheduling; Synchronization; Fairness; Job Scheduling; Locality; MapReduce;
Conference_Title :
Cloud Computing and Intelligence Systems (CCIS), 2011 IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-61284-203-5
DOI :
10.1109/CCIS.2011.6045089