Designing an Offloaded Nonblocking MPI_Allgather Collective Using CORE-Direct

Author

Inozemtsev, Grigori ; Afsahi, Ahmad

Author_Institution

Dept. of Electr. & Comput. Eng., Queen's Univ., Kingston, ON, Canada

fYear

2012

fDate

24-28 Sept. 2012

Firstpage

477

Lastpage

485

Abstract

Collective communication operations in the Message Passing Interface (MPI) consume a significant amount of time at scale, degrading the performance of scientific applications. Optimizing collectives is key to application performance and scalability. This paper focuses on hiding the latency of the allgather collective by efficiently offloading it to the networking hardware. We have investigated the use of Mellanox CORE-Direct offloading technology for independent progression of communication within the collective in order to achieve high communication/computation overlap. This study evaluates several design options for the nonblocking allgather collective and discusses implementations of offloaded Standard Exchange, Ring and Bruck algorithms in flat and hierarchical communicators under single-port and k-port modelling. We have applied our findings to improving the performance of the redesigned Radix Sort application kernel. Performance results suggest that our offloaded nonblocking allgather compares favourably to the blocking variant (with improvements of up to 68% for medium messages in a hierarchical collective) while providing high overlap capability. Multiport modelling is shown to be beneficial, especially in a flat communicator. Radix Sort enjoys up to 40% improvement in its runtime.

Keywords

message passing; Bruck algorithm; Mellanox CORE-Direct offloading technology; Ring algorithm; collective communication operation; hierarchical collective; k-port modelling; message passing interface; multiport modelling; networking hardware; offloaded nonblocking MPI_Allgather collective; offloaded standard exchange; radix sort application kernel; scientific application; single-port modelling; Algorithm design and analysis; Context; Kernel; Message systems; Protocols; Standards; MPI; allgather; collective communication; coredirect; offloading

fLanguage

English

Publisher

IEEE

Conference_Titel

Cluster Computing (CLUSTER), 2012 IEEE International Conference on

Conference_Location

Beijing

Print_ISBN

978-1-4673-2422-9

Type

conf

DOI

10.1109/CLUSTER.2012.75

Filename

6337811