A Dynamic Accelerator-Cluster Architecture

Author

Rinke, Sebastian ; Becker, Daniel ; Lippert, Thomas ; Prabhakaran, Suraj ; Westphal, Lidia ; Wolf, Felix

Author_Institution

Lab. for Parallel Program., German Res. Sch. for Simulation Sci., Aachen, Germany

fYear

2012

fDate

10-13 Sept. 2012

Firstpage

357

Lastpage

366

Abstract

Accelerators such as graphics processing units (GPUs) provide an inexpensive way of improving the performance of cluster systems. In such an arrangement, the individual nodes of the cluster are directly connected to one or more accelerator devices via PCI Express. This results in a static mapping of accelerators onto compute nodes, where each accelerator can be accessed from exactly one compute node. This static mapping enables efficient data transfers between an accelerator and the compute node it belongs to. However, because computational demands differ across jobs, it may leave some accelerators underutilized and leave some nodes with fewer accelerators than they need. In particular, a small number of GPUs per node may force explicit MPI parallelism across compute nodes where it would otherwise be unnecessary. To address this limitation, we propose a novel accelerator-cluster architecture in which network-attached accelerators are dynamically assigned to compute nodes. This allows not only their optimal utilization but also a more precise match between application requirements and accelerator hardware. We outline the general concept of our dynamic architecture and show that it can offer substantial benefits to certain classes of applications without significantly harming the performance of others.
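
The contrast between static mapping and dynamic assignment described in the abstract can be illustrated with a toy model. The following Python sketch is not the authors' implementation or scheduler: the function names, the first-come-first-served policy, and the example demands are all assumptions made for illustration, and it ignores data-transfer cost entirely.

```python
from typing import Dict, List


def static_shortfall(demands: Dict[str, int], per_node: int) -> Dict[str, int]:
    """Accelerators each node still lacks when every node owns exactly
    `per_node` devices (the static, PCI Express style mapping)."""
    return {node: max(0, need - per_node) for node, need in demands.items()}


def static_idle(demands: Dict[str, int], per_node: int) -> Dict[str, int]:
    """Accelerators left unused on each node under the static mapping."""
    return {node: max(0, per_node - need) for node, need in demands.items()}


def dynamic_assign(demands: Dict[str, int], pool_size: int) -> Dict[str, List[int]]:
    """Hypothetical dynamic assignment: hand out accelerator ids from one
    shared, network-attached pool in request order. A node receives fewer
    devices than requested only when the pool is exhausted."""
    free = list(range(pool_size))
    assignment: Dict[str, List[int]] = {}
    for node, need in demands.items():
        assignment[node] = free[:need]
        free = free[need:]
    return assignment


if __name__ == "__main__":
    # Illustrative demands (assumed): three nodes, six accelerators in total.
    demands = {"node0": 3, "node1": 0, "node2": 1}
    print("static shortfall:", static_shortfall(demands, per_node=2))
    print("static idle:     ", static_idle(demands, per_node=2))
    print("dynamic:         ", dynamic_assign(demands, pool_size=6))
```

With the same six devices, the static mapping in this toy example leaves node0 one accelerator short while three devices sit idle on other nodes, whereas the pooled assignment satisfies every request.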

Keywords

graphics processing units; message passing; parallel architectures; peripheral interfaces; CUDA; MPI parallelism; PCI Express; accelerator device; computational demand; data transfer; dynamic accelerator-cluster architecture; dynamic architecture; graphics processing unit; network-attached accelerator; performance improvement; static mapping; Acceleration; Computational modeling; Computer architecture; Graphics processing unit; Hardware; Kernel; Servers; accelerator; assignment; cluster; dynamic; resource management

fLanguage

English

Publisher

IEEE

Conference_Titel

Parallel Processing Workshops (ICPPW), 2012 41st International Conference on

Conference_Location

Pittsburgh, PA

ISSN

1530-2016

Print_ISBN

978-1-4673-2509-7

Type

conf

DOI

10.1109/ICPPW.2012.52

Filename

6337502