Title :
Notified Access: Extending Remote Memory Access Programming Models for Producer-Consumer Synchronization
Author :
Belli, Roberto ; Hoefler, Torsten
Abstract :
Remote Memory Access (RMA) programming enables direct access to low-level hardware features to achieve high performance for distributed-memory programs. However, the design of RMA programming schemes focuses on the memory access and less on the synchronization. For example, in contemporary RMA programming systems, the widely used producer-consumer pattern can only be implemented inefficiently, incurring the overhead of an additional round-trip message. We propose Notified Access, a scheme where the target process of an access can receive a completion notification. This scheme enables direct and efficient synchronization with a minimum number of messages. We implement our scheme in an open source MPI-3 RMA library and demonstrate lower overheads per notification (two cache misses) than other point-to-point synchronization mechanisms. We also evaluate our implementation on three real-world benchmarks: a stencil computation, a tree computation, and a Cholesky factorization implemented with tasks. Our scheme always performs better than traditional message passing and other existing RMA synchronization schemes, providing up to 50% speedup on small messages. Our analysis shows that Notified Access is a valuable primitive for any RMA system. Furthermore, we provide guidance for the design of low-level network interfaces to support Notified Access efficiently.
Keywords :
application program interfaces; distributed memory systems; file organisation; message passing; network interfaces; parallel programming; public domain software; synchronisation; Cholesky factorization; RMA programming scheme; RMA synchronization scheme; distributed-memory program; low-level network interface; message passing; notified access; open source MPI-3 RMA library; point-to-point synchronization mechanism; producer-consumer pattern; producer-consumer synchronization; remote memory access programming model; round-trip message; tree computation; Computational modeling; Hardware; Message passing; Programming; Protocols; Semantics; Synchronization; MPI; RMA; notification; synchronization;
Conference_Title :
Parallel and Distributed Processing Symposium (IPDPS), 2015 IEEE International
Conference_Location :
Hyderabad
DOI :
10.1109/IPDPS.2015.30