Title :
MARS: Scheduling non-local tasks in MapReduce
Author :
Mingxing Tang ; Changjian Wang ; Yuxing Peng
Author_Institution :
Coll. of Comput., Nat. Univ. of Defense Technol., Changsha, China
Abstract :
Data locality is one of the most important principles in MapReduce, and much effort has been devoted to it. However, MapReduce often has tasks, called non-local tasks, that need to access remote data. The existing MapReduce scheduler uses a simple strategy to schedule non-local tasks; it not only applies no optimization but may also produce more non-local tasks. To address these problems, we design a new non-local task scheduling approach for MapReduce, named MARS. A task selection algorithm is proposed to choose a proper non-local task from multiple candidate tasks before scheduling, and an overlapped scheduling algorithm is proposed to optimize the time a task takes to access remote data. Based on this work, a new scheduling mechanism for non-local tasks is designed and implemented in MapReduce. Comprehensive experiments have been performed to verify the effectiveness of MARS. The results show that MARS can reduce Map phase runtime by 25% and achieve better data locality than native Hadoop.
Keywords :
data handling; feature selection; optimisation; parallel processing; scheduling; MARS; MapReduce; data locality; nonlocal task scheduling; task selection algorithm; time optimization; Prefetching; Scheduling algorithms; Non-local tasks; overlap
Conference_Titel :
Cloud Computing and Intelligence Systems (CCIS), 2014 IEEE 3rd International Conference on
Print_ISBN :
978-1-4799-4720-1
DOI :
10.1109/CCIS.2014.7175794