Title :
Skew-Aware Task Scheduling in Clouds
Author :
Dongsheng Li ; Yixing Chen ; Hai, R.H.
Author_Institution :
Nat. Lab. for Parallel & Distrib. Process., Nat. Univ. of Defense Technol., Changsha, China
Abstract :
Data skew is an important cause of stragglers in MapReduce-like cloud systems. In this paper, we propose a Skew-Aware Task Scheduling (SATS) mechanism for iterative applications in MapReduce-like systems. The mechanism exploits the similarity of data distributions in adjacent iterations of iterative applications to mitigate the straggler problem caused by data skew. It collects data distribution information while the tasks of the current iteration execute, and uses that information to guide data partitioning in the tasks of the next iteration. We implement the mechanism in the HaLoop system and deploy it on a cluster. Experiments show that the proposed mechanism handles data skew and improves load balancing effectively.
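The idea of reusing one iteration's data distribution to partition the next can be illustrated with a minimal Python sketch. This is not the SATS algorithm itself, which the abstract does not specify: the greedy heaviest-key-first assignment, the hash fallback, and all function names (`collect_key_counts`, `build_partition_plan`, `partition`, `partition_loads`) are assumptions made for illustration only, and nothing here reflects the HaLoop API.

```python
from collections import Counter
import heapq
from zlib import crc32


def collect_key_counts(records):
    """Count records per key, as a task might do while it runs
    the current iteration (hypothetical helper)."""
    return Counter(key for key, _ in records)


def build_partition_plan(key_counts, num_partitions):
    """Assumed strategy: greedy assignment, heaviest key first,
    each key going to the currently lightest partition."""
    heap = [(0, p) for p in range(num_partitions)]  # (load, partition id)
    heapq.heapify(heap)
    plan = {}
    ordered = sorted(key_counts.items(), key=lambda kv: (-kv[1], str(kv[0])))
    for key, count in ordered:
        load, p = heapq.heappop(heap)
        plan[key] = p
        heapq.heappush(heap, (load + count, p))
    return plan


def partition(key, plan, num_partitions):
    """Partition a key in the next iteration: follow the plan when the
    key was seen before, otherwise fall back to plain hashing."""
    if key in plan:
        return plan[key]
    return crc32(str(key).encode("utf-8")) % num_partitions


def partition_loads(records, plan, num_partitions):
    """Number of records each partition would receive under the plan."""
    loads = [0] * num_partitions
    for key, _ in records:
        loads[partition(key, plan, num_partitions)] += 1
    return loads
```

For example, with a skewed input of six records for key `a`, three for `b`, two for `c`, and one for `d`, calling `build_partition_plan(collect_key_counts(records), 2)` places `a` alone on one partition and the remaining keys on the other, so each partition receives six records if the next iteration has the same distribution.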
Keywords :
cloud computing; iterative methods; resource allocation; scheduling; task analysis; HaLoop system; MapReduce-like cloud systems; MapReduce-like systems; SATS mechanism; data distribution information; data partitioning; data skew; iterative applications; load balancing; skew-aware task scheduling; Computational modeling; Data models; Data structures; Distributed databases; File systems; Load management; Processor scheduling; Cloud; Data Skew; Load balancing; Task Scheduling;
Conference_Title :
Service Oriented System Engineering (SOSE), 2013 IEEE 7th International Symposium on
Conference_Location :
Redwood City
Print_ISBN :
978-1-4673-5659-6
DOI :
10.1109/SOSE.2013.64