DocumentCode :
1914344
Title :
A Dispatching-Rule-Based Task Scheduling Policy for MapReduce with Multi-type Jobs in Heterogeneous Environments
Author :
Gao, Xiang ; Chen, Qing ; Chen, Yurong ; Sun, Qingwei ; Liu, Yan ; Li, Mingzhu
Author_Institution :
China Mobile Commun. Corp. (CMCC), Beijing, China
fYear :
2012
fDate :
20-23 Sept. 2012
Firstpage :
17
Lastpage :
24
Abstract :
MapReduce has emerged as an important and widely used programming model for distributed and parallel computing, owing to its ease of use, generality, and scalability. The model was proposed mainly for large-scale data processing, i.e., data-intensive jobs, and it is optimized for homogeneous environments, in which computing nodes are identical and dedicated. Today, enterprise IT systems retain massive amounts of historical management and operational data, which require both data-intensive and computation-intensive analysis on heterogeneous computing resources. To support enterprise data analysis applications with the MapReduce model, it is important to improve MapReduce's task scheduling algorithm so that it reduces the overall completion time for multi-type jobs in heterogeneous environments. This paper formulates the scheduling problem as an optimization problem. Based on job shop scheduling theory and existing approximation algorithms, we propose a new dispatching-rule-based online scheduling policy, LPT-θ. Under LPT-θ, tasks with larger processing times that fall within a θ-space are assigned higher priorities. Numerical results show that LPT-θ achieves a 12% to 45% performance gain over the original scheduling algorithm in MapReduce.
Keywords :
approximation theory; business data processing; data analysis; dispatching; job shop scheduling; optimisation; parallel processing; MapReduce model; MapReduce task scheduling algorithm; approximation algorithms; computation-intensive analysis; data-intensive jobs; dispatching-rule-based task scheduling policy; distributed computing; enterprise IT systems; enterprise data analysis application; heterogeneous computing resources; heterogeneous environments; homogenous environment; job shop scheduling theory; large-scale data processing; multitype jobs; online scheduling policy LPT-θ; optimization problem; parallel computing; Algorithm design and analysis; Approximation algorithms; Job shop scheduling; Scheduling algorithms; Cloud Computing; Distributed and Parallel Computing; Job Shop Scheduling Problem; MapReduce;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
ChinaGrid Annual Conference (ChinaGrid), 2012 Seventh
Conference_Location :
Beijing
Print_ISBN :
978-1-4673-2623-0
Electronic_ISBN :
978-0-7695-4816-6
Type :
conf
DOI :
10.1109/ChinaGrid.2012.27
Filename :
6337310