Title :
Message passing on data-parallel architectures
Author :
Stuart, Jeff A. ; Owens, John D.
Author_Institution :
Dept. of Comput. Sci., Univ. of California, Davis, CA, USA
Abstract :
This paper explores the challenges in implementing a message passing interface usable on systems with data-parallel processors. As a case study, we design and implement the "DCGN" API on NVIDIA GPUs that is similar to MPI and allows full access to the underlying architecture. We introduce the notion of data-parallel thread-groups as a way to map resources to MPI ranks. We use a method that also allows the data-parallel processors to run autonomously from user-written CPU code. In order to facilitate communication, we use a sleep-based polling system to store and retrieve messages. Unlike previous systems, our method provides both performance and flexibility. By running a test suite of applications with different communication requirements, we find that a tolerable amount of overhead is incurred, somewhere between one and five percent depending on the application, and indicate the locations where this overhead accumulates. We conclude that with innovations in chipsets and drivers, this overhead will be mitigated and provide similar performance to typical CPU-based MPI implementations while providing fully-dynamic communication.
Keywords :
message passing; parallel processing; DCGN API; NVIDIA GPU; data-parallel architecture; data-parallel thread-group; message passing interface; sleep-based polling system; Computer architecture; Computer science; Coprocessors; Distributed computing; Message passing; Performance loss; Technological innovation; Testing; Workstations; Yarn
Conference_Title :
Parallel & Distributed Processing, 2009. IPDPS 2009. IEEE International Symposium on
Conference_Location :
Rome
Print_ISBN :
978-1-4244-3751-1
Electronic_ISBN :
1530-2075
DOI :
10.1109/IPDPS.2009.5161065