DocumentCode :
167456
Title :
Infiniband-Verbs on GPU: A Case Study of Controlling an Infiniband Network Device from the GPU
Author :
Oden, Lena ; Froning, Holger ; Pfreundt, Franz-Joseph
Author_Institution :
Competence Center High Performance Comput., Fraunhofer Inst. for Ind. Math., Kaiserslautern, Germany
fYear :
2014
fDate :
19-23 May 2014
Firstpage :
976
Lastpage :
983
Abstract :
Due to their massive parallelism and high performance per watt, GPUs have gained high popularity in high performance computing and are strong candidates for future exascale systems. However, communication and data transfer in GPU-accelerated systems remain a challenging problem. Since the GPU is normally not able to control a network device, a hybrid programming model is preferred today, in which the GPU is used for calculation and the CPU handles the communication. As a result, communication between distributed GPUs suffers from unnecessary overhead, introduced by switching control flow from GPUs to CPUs and vice versa. In this work, we modify the user space libraries and device drivers of GPUs and of the Infiniband network device so that the GPU can control an Infiniband network device and independently source and sink communication requests without any involvement of the CPU. Our performance analysis shows the differences to hybrid communication models in detail, in particular that the CPU's advantage in generating work requests outweighs the overhead associated with context switching. In other words, our results show that complex networking protocols like IBVERBS are better handled by CPUs in spite of time penalties due to context switching, since the overhead of work request generation cannot be parallelized and does not suit the highly parallel programming model of GPUs.
Keywords :
graphics processing units; parallel programming; GPU; IBVERBS; data transfer; exascale systems; high parallel programming model; high performance computing; high performance per watt; hybrid-programming model; infiniband network device; infiniband-verbs; massive parallelism; Context; Data transfer; Graphics processing units; Instruction sets; Libraries; Performance evaluation; Registers; Communication; GPUs; Heterogeneous Clusters; Infiniband; RDMA;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International
Conference_Location :
Phoenix, AZ
Print_ISBN :
978-1-4799-4117-9
Type :
conf
DOI :
10.1109/IPDPSW.2014.111
Filename :
6969487