DocumentCode :
611020
Title :
Evaluation of Inter- and Intra-node Data Transfer Efficiencies between GPU Devices and their Impact on Scalable Applications
Author :
Pena, A.J. ; Alam, S.R.
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. Jaume I, Castellon de la Plana, Spain
fYear :
2013
fDate :
13-16 May 2013
Firstpage :
144
Lastpage :
151
Abstract :
Data movement is of high relevance for GPU computing. Communication and performance efficiencies of applications and systems with GPU accelerators depend on on- and off-node data paths, making tuning and optimization an increasingly complex task. In this paper we conduct an in-depth study to establish the parameters that influence the performance of data transfers between GPU devices located on the same node (on-node) and on separate nodes (off-node). We compare the most recent version of MVAPICH2, featuring seamless remote GPU transfers, with our own low-level benchmarks, and discuss the bottlenecks that may arise. Data path performance and bottlenecks between GPU devices are analyzed and compared for two substantially different systems: an IBM iDataPlex relying on an InfiniBand QDR fabric with two on-node GPU devices, and a Cray XK6 featuring a single GPU per node and connected through a Gemini interconnect. Finally, we adapt LAMMPS, a GPU-accelerated application, to benefit from efficient inter-GPU data transfers, and validate our findings.
Keywords :
benchmark testing; data handling; electronic data interchange; graphics processing units; parallel processing; Cray XK6; GPU accelerators; GPU computing; GPU-accelerated application; Gemini interconnection; IBM iDataPlex; InfiniBand QDR fabric; LAMMPS; MVAPICH2; data movement; data path performance; interGPU data transfers; internode data transfer efficiencies; intranode data transfer efficiencies; low-level benchmarks; off-node data paths; on-node GPU devices; remote GPU transfers; Benchmark testing; Data transfer; Graphics processing units; Libraries; Memory management; Pipelines; Throughput; GPU computing; cluster computing; high performance computing; performance evaluation;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cluster, Cloud and Grid Computing (CCGrid), 2013 13th IEEE/ACM International Symposium on
Conference_Location :
Delft
Print_ISBN :
978-1-4673-6465-2
Type :
conf
DOI :
10.1109/CCGrid.2013.15
Filename :
6546072