DocumentCode
569492
Title
Communication and Memory Access Latency Characteristics of CPU/GPU Heterogeneous Cluster
Author
Wu, Yongwen ; Song, Junqiang ; Lu, Fengshun ; Yin, Fukang
Author_Institution
Coll. of Comput., Nat. Univ. of Defense Technol., Changsha, China
fYear
2012
fDate
17-19 Aug. 2012
Firstpage
958
Lastpage
961
Abstract
CPU/GPU heterogeneous computing has developed rapidly in recent years. Because CPUs and GPUs differ greatly, CPU/GPU heterogeneous computing still faces many challenges, and software design needs to explore how fine-grained and coarse-grained parallelism can work together. This paper presents a comprehensive study of both the hardware and the program execution characteristics of a CPU/GPU heterogeneous cluster. By running the OSU Micro-Benchmark (OMB) tests on the TH-1A system, we obtained the inter-node and intra-node communication bandwidth and the memory access latency between CPU and GPU. Finally, we designed experiments that run the IS and FT benchmarks of the NPB suite on TH-1A. The experiments show that the desired results can be obtained on a CPU/GPU heterogeneous cluster when the problem is computation-intensive and the problem scale is relatively large. The results also provide practical principles for designing a parallel computing model for CPU/GPU heterogeneous clusters in our future work.
Keywords
benchmark testing; graphics processing units; multiprocessing systems; parallel programming; storage management; CPU-GPU heterogeneous cluster; CPU-GPU heterogeneous computing; FT benchmarks; IS benchmarks; NPB suite; OMB test; OSU Micro-Benchmark test; TH-1A system; coarse-grained parallelism; communication latency characteristics; fine-grained parallelism; inter-node communication bandwidth; intra-node communication bandwidth; memory access latency characteristics; parallel computing model; program execution characteristics; software design; Bandwidth; Benchmark testing; Central Processing Unit; Computer architecture; Computers; Graphics processing unit; Parallel processing; CPU/GPU heterogeneous cluster; CUDA; Heterogeneous computing; NPB Benchmark;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational and Information Sciences (ICCIS), 2012 Fourth International Conference on
Conference_Location
Chongqing
Print_ISBN
978-1-4673-2406-9
Type
conf
DOI
10.1109/ICCIS.2012.104
Filename
6300767