DocumentCode :
3678340
Title :
Improving Strong-Scaling on GPU Cluster Based on Tightly Coupled Accelerators Architecture
Author :
Toshihiro Hanawa;Hisafumi Fujii;Norihisa Fujita;Tetsuya Odajima;Kazuya Matsumoto;Yuetsu Kodama;Taisuke Boku
Author_Institution :
Inf. Technol. Center, Univ. of Tokyo, Kashiwa, Japan
fYear :
2015
Firstpage :
88
Lastpage :
91
Abstract :
The Tightly Coupled Accelerators (TCA) architecture that we proposed in previous work enables direct communication between accelerators over nodes. In this paper, we present a proof-of-concept GPU cluster called the HA-PACS/TCA using the PEACH2 chip that we designed as an interconnection router chip based on the TCA architecture. Our system demonstrated 2.0 μsec of latency on inter-node GPU-to-GPU communication with a PCIe Gen2 x8 by RDMA, reducing minimum latency to just 44% of the InfiniBand-QDR and MPI using GPUDirect for RDMA. Through results of Himeno benchmark tests, we demonstrated that our TCA architecture improved performance scalability with the small-sized problem by up to 61%.
Keywords :
"Graphics processing units","Computer architecture","Benchmark testing","Bandwidth","Sockets","Conferences","Picture archiving and communication systems"
Publisher :
IEEE
Conference_Title :
Cluster Computing (CLUSTER), 2015 IEEE International Conference on
Type :
conf
DOI :
10.1109/CLUSTER.2015.154
Filename :
7307569