Regional Information Center for Science and Technology - A Coprocessor Sharing-Aware Scheduler for Xeon Phi-Based Compute Clusters

DocumentCode :

1783241

Title :

A Coprocessor Sharing-Aware Scheduler for Xeon Phi-Based Compute Clusters

Author :

Coviello, Giuseppe ; Cadambi, Srihari ; Chakradhar, Srimat

Author_Institution :

NEC Labs. America, Inc., Princeton, NJ, USA

fYear :

2014

fDate :

19-23 May 2014

Firstpage :

337

Lastpage :

346

Abstract :

We propose a cluster scheduling technique for compute clusters with Xeon Phi coprocessors. Even though the Xeon Phi runs Linux, which allows multiprocessing, cluster schedulers generally do not allow jobs to share coprocessors because sharing can cause oversubscription of coprocessor memory and thread resources. It has been shown that memory or thread oversubscription on a many-core processor such as the Phi results in job crashes or drastic performance loss. We first show that such an exclusive device allocation policy causes severe coprocessor underutilization: for typical workloads, on average only 38% of the Xeon Phi cores are busy across the cluster. Then, to improve coprocessor utilization, we propose a scheduling technique that enables safe coprocessor sharing without resource oversubscription. Jobs specify their maximum memory and thread requirements, and our scheduler packs as many jobs as possible on each coprocessor in the cluster, subject to resource limits. We solve this problem using a greedy approach at the cluster level combined with a knapsack-based algorithm for each node. Every coprocessor is modeled as a knapsack, and jobs are packed into each knapsack with the goal of maximizing job concurrency, i.e., having as many jobs as possible executing on each coprocessor. Given a set of jobs, we show that this strategy of packing for high concurrency is a good proxy for (i) reducing makespan, without the need for users to specify job execution times, and (ii) reducing coprocessor footprint, i.e., the number of coprocessors required to finish the jobs without increasing makespan. We implement the entire system as a seamless add-on to Condor, a popular distributed job scheduler, and show makespan and footprint reductions of more than 50% across a wide range of workloads.
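
The packing idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the algorithm's details, so the sketch substitutes an exhaustive search for the per-device knapsack step and a simple device-by-device loop for the cluster-level greedy step. The names `Job`, `pack_one`, and `schedule`, and the capacities of 8000 MB and 240 threads per device, are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Job:
    name: str      # assumed unique per job
    mem_mb: int    # declared maximum memory (hypothetical unit: MB)
    threads: int   # declared maximum thread count


def pack_one(jobs, mem_cap, thread_cap):
    """Return a largest subset of jobs that fits on one coprocessor.

    Every job has unit value, so maximizing concurrency means
    maximizing the subset size under both resource limits.
    Exhaustive search (exponential); only suitable for short queues.
    """
    for size in range(len(jobs), 0, -1):
        for subset in combinations(jobs, size):
            if (sum(j.mem_mb for j in subset) <= mem_cap
                    and sum(j.threads for j in subset) <= thread_cap):
                return list(subset)
    return []


def schedule(jobs, n_devices, mem_cap, thread_cap):
    """Greedy cluster-level pass: fill one device at a time."""
    pending = list(jobs)
    placement = []
    for _ in range(n_devices):
        chosen = pack_one(pending, mem_cap, thread_cap)
        placement.append(chosen)
        pending = [j for j in pending if j not in chosen]
    return placement, pending


# Illustrative queue; all numbers are made up.
queue = [
    Job("A", 4000, 120),
    Job("B", 3000, 60),
    Job("C", 1000, 60),
    Job("D", 6000, 200),
    Job("E", 2000, 100),
]
placement, pending = schedule(queue, n_devices=2, mem_cap=8000, thread_cap=240)
for i, jobs_on_device in enumerate(placement):
    print(i, [j.name for j in jobs_on_device])
print("waiting:", [j.name for j in pending])
```

With this queue, three jobs share the first device instead of one job holding it exclusively, and no device exceeds its declared memory or thread limit. A production scheduler would replace the exhaustive search with a proper knapsack algorithm and would re-run the packing as jobs finish.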

Keywords :

coprocessors; greedy algorithms; multiprocessing systems; pattern clustering; processor scheduling; Condor; Linux; Xeon Phi-based compute clusters; cluster scheduling technique; coprocessor footprint reduction; coprocessor memory oversubscription; coprocessor sharing-aware scheduler; coprocessor underutilization; coprocessor utilization; distributed job scheduler; exclusive device allocation policy; greedy approach; job concurrency maximization; knapsack-based algorithm; multiprocessing; performance loss; thread oversubscription; thread resources; Concurrent computing; Coprocessors; Hardware; Instruction sets; Linux; Memory management; Servers; Middleware; coprocessors; high performance computing; processor scheduling;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Parallel and Distributed Processing Symposium, 2014 IEEE 28th International

Conference_Location :

Phoenix, AZ

ISSN :

1530-2075

Print_ISBN :

978-1-4799-3799-8

Type :

conf

DOI :

10.1109/IPDPS.2014.44

Filename :

6877268

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1783241