DocumentCode :
687687
Title :
A novel decentralized asynchronous scheduler for Hadoop
Author :
Dai, Xiangming ; Bensaou, Brahim
Author_Institution :
Dept. of Comput. Sci. & Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
fYear :
2013
fDate :
9-13 Dec. 2013
Firstpage :
1470
Lastpage :
1475
Abstract :
In cloud computing systems such as Hadoop, system performance is a significant target for improvement. In classic master node-central schedulers, decisions are made on the heartbeat time scale, and slots that become idle during a heartbeat interval remain idle until the master node allocates them a task. In this paper, we propose a novel scheduler named multiple queues scheduler (MQS) that improves system throughput by increasing the data locality rate of map tasks, thereby reducing the average completion time of jobs. To achieve this, we associate slave nodes with individual queues and, when a job arrives, distribute its tasks to the nodes that hold the associated input data. To reduce the load on overloaded slave nodes, task migration is performed asynchronously between nodes within a rack, without the intervention of the master node. Our results demonstrate the effectiveness of the proposed algorithm. The benefits of MQS are three-fold: first, it decreases the probability of allocating map tasks to non-data-local nodes; second, it decreases the time wasted between heartbeats (these two aspects directly improve system performance); and third, it mitigates the stress on the master node by assigning part of the scheduler's functions to slave nodes.
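The two mechanisms named in the abstract (per-node queues filled by data locality, and rack-local migration away from overloaded nodes) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the placement or migration policy, so the shortest-queue tie-breaking, the `threshold` parameter, and the names `Node`, `assign_by_locality` and `migrate_within_rack` are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    """A slave node with its own task queue (hypothetical model)."""
    name: str
    rack: str
    queue: deque = field(default_factory=deque)


def assign_by_locality(tasks, replicas, nodes):
    """Queue each map task on a node that stores its input data.

    Assumption: among the replica holders, pick the shortest queue.
    """
    for task in tasks:
        holders = [nodes[name] for name in replicas[task]]
        target = min(holders, key=lambda n: len(n.queue))
        target.queue.append(task)


def migrate_within_rack(nodes, threshold):
    """Move tasks off nodes whose queue exceeds `threshold`.

    Tasks only move to a peer in the same rack, and only while the
    move actually shortens the gap between the two queues. No master
    node is consulted. Returns the list of (task, source, destination).
    """
    moved = []
    for src in nodes.values():
        while len(src.queue) > threshold:
            peers = [n for n in nodes.values()
                     if n.rack == src.rack and n is not src]
            if not peers:
                break
            dst = min(peers, key=lambda n: len(n.queue))
            if len(dst.queue) + 1 >= len(src.queue):
                break  # migrating would not reduce the imbalance
            task = src.queue.pop()  # take the most recently queued task
            dst.queue.append(task)
            moved.append((task, src.name, dst.name))
    return moved
```

In this sketch, a job whose input blocks all sit on one node first lands entirely in that node's queue; a later call to `migrate_within_rack` then shifts the excess to a rack neighbour, trading some data locality for a shorter wait.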
Keywords :
cloud computing; parallel programming; probability; scheduling; Hadoop; MQS scheduler; asynchronous task migration; average job completion time reduction; cloud computing systems; data locality; data locality rate; decentralized asynchronous scheduler; heartbeat time scale; input data; job task distribution; load reduction; map task allocation probability; master node-central schedulers; multiple queue scheduler; non-data-local nodes; overloaded slave nodes; scheduler functions; stress mitigation; system performance improvement; system throughput improvement; MapReduce; performance
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Global Communications Conference (GLOBECOM), 2013 IEEE
Conference_Location :
Atlanta, GA
Type :
conf
DOI :
10.1109/GLOCOM.2013.6831281
Filename :
6831281