Title :
A distribution aware scheduling method in MapReduce
Author :
Zhang, Xiaohong ; Ding, Yang
Author_Institution :
Sch. of Comput. Sci. & Technol., Henan Polytech. Univ., Jiaozuo, China
Abstract :
Data locality is one of the critical factors that affect system performance. In this paper, we focus on the data locality problem in Hadoop MapReduce. To improve the data locality of MapReduce, we propose a scheduling method. After receiving a request from a node, the method selects a task from the first level of the node, followed by the second and the third levels. It then checks whether the selected task is the only one on the first level of the node to issue a request. If so, the method skips the selected task and selects another task for the requesting node. Otherwise, the method schedules the selected task to the node. Our analysis shows that, compared with the default scheduling method of Hadoop MapReduce, the proposed method can improve the efficiency of data locality.
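The selection rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: the three levels are taken to be node-local, rack-local, and off-rack; "the node to issue a request" is read as another node expected to request a task next; and all names (`Task`, `locality_level`, `select_task`, `next_node`, `rack_of`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    task_id: str
    replica_nodes: frozenset  # nodes that hold a replica of the task's input split


def locality_level(task, node, rack_of):
    """Assumed levels: 1 = node-local, 2 = rack-local, 3 = off-rack."""
    if node in task.replica_nodes:
        return 1
    if any(rack_of[n] == rack_of[node] for n in task.replica_nodes):
        return 2
    return 3


def is_only_first_level_task(task, other_node, pending):
    """True if `task` is the single pending task whose data is local to `other_node`."""
    local = [t for t in pending if other_node in t.replica_nodes]
    return len(local) == 1 and local[0] is task


def select_task(node, next_node, pending, rack_of) -> Optional[Task]:
    """Pick a task for `node`, trying level 1, then level 2, then level 3.

    Assumption: a candidate is skipped when it is the only first-level task
    of `next_node`, the node expected to issue the next request.
    """
    # sorted() is stable, so tasks on the same level keep their queue order.
    ordered = sorted(pending, key=lambda t: locality_level(t, node, rack_of))
    for task in ordered:
        if (next_node is not None and next_node != node
                and is_only_first_level_task(task, next_node, pending)):
            continue  # leave this task for the node that needs it most
        return task
    # Assumed fallback: if every candidate was skipped, run the best one anyway.
    return ordered[0] if ordered else None


rack_of = {"n1": "r1", "n2": "r1", "n3": "r2"}
pending = [
    Task("t1", frozenset({"n1", "n2"})),
    Task("t2", frozenset({"n1"})),
    Task("t3", frozenset({"n3"})),
]
chosen = select_task("n1", "n2", pending, rack_of)
print(chosen.task_id)
```

In this toy setup, t1 is local to the requesting node n1 but is also the only task local to n2, so the sketch passes over it and gives n1 its other local task, t2. A purely greedy scheduler would hand t1 to n1 and leave n2 with no local work.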
Keywords :
Internet; data handling; public domain software; scheduling; software performance evaluation; Hadoop MapReduce; Internet technologies; data locality problem; distribution aware scheduling method; system performance; Educational institutions; Nonhomogeneous media; Data intensive applications; Data locality; MapReduce; Scheduling
Conference_Title :
Electrical & Electronics Engineering (EEESYM), 2012 IEEE Symposium on
Conference_Location :
Kuala Lumpur
Print_ISBN :
978-1-4673-2363-5
DOI :
10.1109/EEESym.2012.6258605