DocumentCode
3591171
Title
A flexible scheduling framework for heterogeneous CPU-GPU clusters
Author
Sajjapongse, Kittisak ; Agarwal, Tejaswi ; Becchi, Michela
fYear
2014
Firstpage
1
Lastpage
11
Abstract
In the last few years, thanks to their computational power and progressively increased programmability, GPUs have become part of HPC clusters. As a result, widely used open-source cluster resource managers (e.g., TORQUE and SLURM) have recently been extended with GPU support capabilities. These systems, however, treat GPUs as dedicated resources and provide scheduling mechanisms that often result in resource underutilization and, thereby, in suboptimal performance. We propose a cluster-level scheduler and integrate it with our previously proposed node-level GPU virtualization runtime [1, 2], thus providing a hierarchical cluster resource management framework that allows the efficient use of heterogeneous CPU-GPU clusters. The scheduling policy used by our system is configurable, and our scheduler provides administrators with a high-level API for easily defining custom scheduling policies. We provide two application- and hardware-heterogeneity-aware cluster-level scheduling schemes for hybrid MPI-CUDA applications, co-location-based and latency-reduction-based scheduling, and use them in combination with a preemption-based GPU sharing policy implemented at the node level. We validate our framework on two heterogeneous clusters, one consisting of commodity workstations and the other of high-end nodes with various hardware configurations, and on a mix of communication- and compute-intensive applications. Our experiments show that, by better utilizing the available resources, our scheduling framework outperforms existing batch schedulers in terms of both throughput and application latency.
Keywords
application program interfaces; graphics processing units; message passing; parallel architectures; scheduling; cluster-level scheduler; co-location-based scheduling; flexible scheduling framework; heterogeneous CPU-GPU clusters; high-level API; hybrid MPI-CUDA applications; latency-reduction-based scheduling; open-source cluster resource managers; preemption-based GPU sharing policy; Computer architecture; Graphics processing units; Libraries; Optimal scheduling; Processor scheduling; Runtime; Torque; GPU; HPC clusters; runtime design; scheduling
fLanguage
English
Publisher
IEEE
Conference_Titel
High Performance Computing (HiPC), 2014 21st International Conference on
Print_ISBN
978-1-4799-5975-4
Type
conf
DOI
10.1109/HiPC.2014.7116892
Filename
7116892