Title :
Collocating CPU-only Jobs with GPU-assisted Jobs on GPU-assisted HPC
Author :
Jiadong Wu ; Bo Hong
Author_Institution :
Sch. of Electr. & Comput. Eng., Georgia Inst. of Technol., Atlanta, GA, USA
Abstract :
In recent years, GPU has evolved rapidly and exhibited great potential in accelerating scientific applications. Massive GPU-assisted HPC systems have been deployed. However, as a heterogeneous system, GPU-assisted HPC is harder to be programmed and utilized than conventional CPU-only system. Statistics of the Keene land system indicate that the effective utilization rate of computational resources is only about 40% when the system runs in normal condition with enough jobs in its queue. Our theoretical model shows that the lack of overlap between CPU/GPU computation is a major obstacle in the efficient utilization of heterogeneous system. In this paper, we evaluate the possibility of collocating CPU-only job with GPU-assisted job on the same node to increase overlap between CPU/GPU computation, thus achieving better utilization. Several performance compromising factors, such as resource isolation, CPU load, and GPU memory demands, are studied based on workload from popular MPI/CUDA benchmarks. The results indicate that, when those factors are managed properly, the collocated CPU-only job can efficiently scavenge the underutilized CPU resource without affecting the performance of both collocated jobs. Based on this insight, an experimental system with collocation-aware job scheduler and resource manager is proposed. With our experiment workload pool of mixed CPU and GPU jobs, the system demonstrates 15% gain in throughput and 10% gain in both CPU and GPU utilization.
Keywords :
graphics processing units; message passing; natural sciences computing; parallel architectures; parallel processing; resource allocation; scheduling; statistical analysis; CPU-GPU computation; CPU-only job collocation; GPU-assisted HPC; GPU-assisted jobs; Keeneland system statistics; MPI-CUDA benchmarks; collocation-aware job scheduler; computational resources utilization rate; resource manager; scientific applications; Bandwidth; Benchmark testing; Central Processing Unit; Computational modeling; Graphics processing units; Peer-to-peer computing; Throughput;
Conference_Titel :
Cluster, Cloud and Grid Computing (CCGrid), 2013 13th IEEE/ACM International Symposium on
Conference_Location :
Delft
Print_ISBN :
978-1-4673-6465-2
DOI :
10.1109/CCGrid.2013.19