DocumentCode
3740663
Title
Collective Offload for Heterogeneous Clusters
Author
Florentino Sainz; Bellón; Vicenç; Jesús
Author_Institution
Barcelona Supercomputing Center, Barcelona, Spain
fYear
2015
Firstpage
376
Lastpage
385
Abstract
Exascale performance requires a level of energy efficiency achievable only with specialized hardware. Hence, building a general-purpose HPC system with Exascale performance will require different types of processors, memory technologies, and interconnection networks. Heterogeneous hardware is already present in some top supercomputer systems, which are composed of different compute nodes that in turn contain different types of processors and memories. Moreover, heterogeneous hardware is much harder to manage and exploit than homogeneous hardware, further increasing the complexity of applications that run on HPC systems. Most HPC applications use MPI to implement a rigid Single Program Multiple Data (SPMD) execution model that no longer fits the heterogeneous nature of the underlying hardware. MPI does provide the powerful and flexible MPI_Comm_spawn API call, which was designed to exploit heterogeneous hardware dynamically, but at the expense of higher complexity, which has hindered wider adoption of this API. In this paper, we extend the OmpSs programming model to offload MPI kernels dynamically, replacing the low-level and more error-prone MPI_Comm_spawn call with high-level, easier-to-use OmpSs pragmas. The evaluation shows that our proposal simplifies the dynamic offload of MPI kernels while maintaining competitive performance and scaling to a high number of nodes.
Keywords
"Graphics processing units","Hardware","Kernel","Programming","Resource management","Complexity theory"
Publisher
ieee
Conference_Titel
2015 IEEE 22nd International Conference on High Performance Computing (HiPC)
Type
conf
DOI
10.1109/HiPC.2015.20
Filename
7397653
Link To Document