DocumentCode :
1783241
Title :
A Coprocessor Sharing-Aware Scheduler for Xeon Phi-Based Compute Clusters
Author :
Coviello, Giuseppe ; Cadambi, Srihari ; Chakradhar, Srimat
Author_Institution :
NEC Labs. America, Inc., Princeton, NJ, USA
fYear :
2014
fDate :
19-23 May 2014
Firstpage :
337
Lastpage :
346
Abstract :
We propose a cluster scheduling technique for compute clusters with Xeon Phi coprocessors. Even though the Xeon Phi runs Linux, which allows multiprocessing, cluster schedulers generally do not allow jobs to share coprocessors because sharing can oversubscribe coprocessor memory and thread resources. It has been shown that memory or thread oversubscription on a many-core processor like the Phi results in job crashes or drastic performance loss. We first show that such an exclusive device allocation policy causes severe coprocessor underutilization: for typical workloads, on average only 38% of the Xeon Phi cores are busy across the cluster. Then, to improve coprocessor utilization, we propose a scheduling technique that enables safe coprocessor sharing without resource oversubscription. Jobs specify their maximum memory and thread requirements, and our scheduler packs as many jobs as possible on each coprocessor in the cluster, subject to resource limits. We solve this problem using a greedy approach at the cluster level combined with a knapsack-based algorithm for each node. Every coprocessor is modeled as a knapsack, and jobs are packed into each knapsack with the goal of maximizing job concurrency, i.e., having as many jobs as possible executing on each coprocessor. Given a set of jobs, we show that this strategy of packing for high concurrency is a good proxy for (i) reducing makespan, without the need for users to specify job execution times, and (ii) reducing coprocessor footprint, i.e., the number of coprocessors required to finish the jobs without increasing makespan. We implement the entire system as a seamless add-on to Condor, a popular distributed job scheduler, and show makespan and footprint reductions of more than 50% across a wide range of workloads.
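The packing idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the names `Job`, `pack_one`, and `schedule`, the device names, and the capacity figures are all hypothetical, and the per-coprocessor step is a plain unit-value 0/1 knapsack over two capacities (memory and threads), which is one straightforward reading of "maximizing job concurrency". The cluster-level loop is a simple greedy pass that fills one coprocessor at a time.

```python
from typing import NamedTuple


class Job(NamedTuple):
    name: str
    mem_mb: int   # declared maximum memory (hypothetical unit: MB)
    threads: int  # declared maximum thread count


def pack_one(jobs, mem_cap, thread_cap):
    """Pick a largest subset of jobs that fits one coprocessor.

    Unit-value 0/1 knapsack with two capacities: best maps a
    (memory used, threads used) pair to the largest tuple of job
    indices found so far that produces exactly that usage.
    """
    best = {(0, 0): ()}
    for i, job in enumerate(jobs):
        # Iterate over a snapshot so job i is added at most once per state.
        for (mem, thr), chosen in list(best.items()):
            new_mem, new_thr = mem + job.mem_mb, thr + job.threads
            if new_mem > mem_cap or new_thr > thread_cap:
                continue  # would oversubscribe the coprocessor
            candidate = chosen + (i,)
            if len(candidate) > len(best.get((new_mem, new_thr), ())):
                best[(new_mem, new_thr)] = candidate
    picked = max(best.values(), key=len)
    return [jobs[i] for i in picked]


def schedule(jobs, coprocessors):
    """Greedy cluster-level pass: fill each coprocessor in turn.

    coprocessors maps a device name to (mem_cap, thread_cap).
    Returns (assignment, leftover), where leftover jobs must wait.
    """
    remaining = list(jobs)
    assignment = {}
    for dev, (mem_cap, thread_cap) in coprocessors.items():
        packed = pack_one(remaining, mem_cap, thread_cap)
        assignment[dev] = packed
        for job in packed:
            remaining.remove(job)
    return assignment, remaining


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    jobs = [
        Job("A", 4000, 120),
        Job("B", 2000, 60),
        Job("C", 1500, 60),
        Job("D", 3000, 100),
        Job("E", 6000, 200),
    ]
    devices = {"mic0": (8000, 240), "mic1": (8000, 240)}
    assignment, leftover = schedule(jobs, devices)
    for dev, packed in assignment.items():
        print(dev, [j.name for j in packed])
    print("waiting:", [j.name for j in leftover])
```

Note that the sketch needs only the declared memory and thread maxima of each job, never an execution time, which mirrors the abstract's point that concurrency serves as a proxy for makespan.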
Keywords :
coprocessors; greedy algorithms; multiprocessing systems; pattern clustering; processor scheduling; Condor; Linux; Xeon Phi-based compute clusters; cluster scheduling technique; coprocessor footprint reduction; coprocessor memory oversubscription; coprocessor sharing-aware scheduler; coprocessor underutilization; coprocessor utilization; distributed job scheduler; exclusive device allocation policy; greedy approach; job concurrency maximization; knapsack-based algorithm; multiprocessing; performance loss; thread oversubscription; thread resources; Concurrent computing; Coprocessors; Hardware; Instruction sets; Linux; Memory management; Servers; Middleware; coprocessors; high performance computing; processor scheduling;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Processing Symposium, 2014 IEEE 28th International
Conference_Location :
Phoenix, AZ
ISSN :
1530-2075
Print_ISBN :
978-1-4799-3799-8
Type :
conf
DOI :
10.1109/IPDPS.2014.44
Filename :
6877268