Regional Information Center for Science and Technology - Efficient Intranode Communication in GPU-Accelerated Systems

DocumentCode :

3001479

Title :

Efficient Intranode Communication in GPU-Accelerated Systems

Author :

Ji, Feng ; Aji, Ashwin M. ; Dinan, James ; Buntinas, Darius ; Balaji, Pavan ; Feng, Wu-chun ; Ma, Xiaosong

Year :

2012

Date :

21-25 May 2012

Firstpage :

1838

Lastpage :

1847

Abstract :

Current implementations of MPI are unaware of accelerator memory (i.e., GPU device memory) and require programmers to explicitly move data between memory spaces. This approach is inefficient, especially for intranode communication, where it can result in several extra copy operations. In this work, we integrate GPU-awareness into a popular MPI runtime system and develop techniques to significantly reduce the cost of intranode communication involving one or more GPUs. Experimental results show up to a 2x increase in bandwidth, resulting in an average improvement of 4.3% in the total execution time of a halo exchange benchmark.
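
The "extra copy operations" mentioned in the abstract can be illustrated with a toy model. The sketch below is not the authors' implementation and makes no real MPI or CUDA calls. It assumes that intranode messages pass through a shared-memory buffer and that a GPU-aware runtime could copy directly between device memory and that buffer. All names (`CopyCounter`, `staged_transfer`, `gpu_aware_transfer`) are hypothetical, and plain byte arrays stand in for device, host, and shared memory.

```python
# Toy model: count the copies needed to move one message between two GPUs
# on the same node. Hypothetical names only; no real MPI or CUDA is used.


class CopyCounter:
    """Performs buffer-to-buffer copies and records a label for each one."""

    def __init__(self):
        self.copies = []

    def copy(self, src, dst, label):
        dst[:] = src  # same-length byte copy
        self.copies.append(label)


def staged_transfer(device_src, counter):
    """GPU-unaware path: the programmer stages data through host buffers."""
    n = len(device_src)
    host_send, shared = bytearray(n), bytearray(n)
    host_recv, device_dst = bytearray(n), bytearray(n)
    counter.copy(device_src, host_send, "device -> host (manual)")
    counter.copy(host_send, shared, "host -> shared buffer (send)")
    counter.copy(shared, host_recv, "shared buffer -> host (receive)")
    counter.copy(host_recv, device_dst, "host -> device (manual)")
    return device_dst


def gpu_aware_transfer(device_src, counter):
    """Assumed GPU-aware path: copy straight to and from the shared buffer."""
    n = len(device_src)
    shared, device_dst = bytearray(n), bytearray(n)
    counter.copy(device_src, shared, "device -> shared buffer")
    counter.copy(shared, device_dst, "shared buffer -> device")
    return device_dst


if __name__ == "__main__":
    message = bytearray(b"halo cells")
    for transfer in (staged_transfer, gpu_aware_transfer):
        counter = CopyCounter()
        result = transfer(message, counter)
        print(transfer.__name__, len(counter.copies), result == message)
```

In this model the staged path performs four copies and the assumed GPU-aware path performs two. The model only counts copies; it says nothing about the bandwidth figures reported in the abstract.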

Keywords :

graphics processing units; message passing; GPU device memory; GPU-accelerated systems; MPI runtime system; accelerator memory; intranode communication; Bandwidth; Computer architecture; Graphics processing unit; Manuals; Performance evaluation; Programming; Receivers; CUDA; GPU; Intranode communication; MPI; MPICH2; Nemesis

Language :

English

Publisher :

IEEE

Conference_Title :

Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2012 IEEE 26th International

Conference_Location :

Shanghai

Print_ISBN :

978-1-4673-0974-5

Type :

conf

DOI :

10.1109/IPDPSW.2012.227

Filename :

6270862

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3001479