Title :
Automatic Mapping Single-Device OpenCL Program to Heterogeneous Multi-device Platform
Author :
Dong Chen ; Changqing Xun ; Dafei Huang ; Mei Wen ; Chunyuan Zhang
Author_Institution :
Nat. Key Lab. of Parallel & Distrib. Process., Nat. Univ. of Defense Technol., Changsha, China
Abstract :
In this paper, we propose a framework that automatically maps single-device OpenCL programs to heterogeneous multi-device platforms with performance in mind. The framework builds on the independence of work groups, which is inherent in the OpenCL programming model, and relies heavily on knowledge of the global memory regions that each work group accesses. We therefore analyze the global memory access patterns of work groups and design an abstract representation, CCRwS, that describes the exact memory access region of each memory access statement in a kernel. A global memory access analyzer obtains CCRwSs by performing static program analysis on the kernel code. Based on the CCRwSs, the framework fully controls data transfer between the host and the multiple devices. A kernel code regenerator then distributes the workload and performs architecture-specific optimizations through code transformation. We tested the framework on a platform with 2 Intel E5-2650 CPUs and 4 NVIDIA Tesla C2050 GPUs. Compared with execution on a single GPU, the kernels running on all 6 devices are about 4.5x faster.
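The idea of distributing independent work groups and deriving the memory region each device needs can be illustrated with a minimal sketch. It is not the paper's CCRwS representation or its analyzer. It assumes a 1-D kernel in which work item i accesses only element i of a global buffer, so a contiguous block of work groups maps to a contiguous block of elements. The names `Slice` and `partition_work_groups`, the device labels, and the integer weights are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    device: str
    first_group: int  # first work group assigned (inclusive)
    last_group: int   # end of the assigned work groups (exclusive)
    first_elem: int   # first buffer element accessed (inclusive)
    last_elem: int    # end of the accessed elements (exclusive)


def partition_work_groups(num_groups, local_size, weights):
    """Split a 1-D range of work groups across devices by integer weight.

    Assumption (not from the paper): work item i touches only element i,
    so the element range of a slice is its group range times local_size.
    """
    total = sum(weights.values())
    devices = list(weights.items())
    slices = []
    start = 0
    acc = 0
    for idx, (device, weight) in enumerate(devices):
        acc += weight
        # The last device takes the remainder so every group is assigned once.
        if idx == len(devices) - 1:
            end = num_groups
        else:
            end = (num_groups * acc) // total
        slices.append(
            Slice(device, start, end, start * local_size, end * local_size)
        )
        start = end
    return slices


if __name__ == "__main__":
    # Hypothetical platform: one CPU and one GPU, the GPU weighted 3x.
    for s in partition_work_groups(8, 64, {"cpu0": 1, "gpu0": 3}):
        print(s)
```

Under these assumptions, each slice tells a host program which work groups to launch on a device and which element range to copy to and from it. Real kernels with strided or overlapping accesses would need a richer region description, which is the role the abstract assigns to CCRwS.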
Keywords :
graphics processing units; parallel programming; program diagnostics; Intel E5-2650 CPU; NVIDIA Tesla C2050 GPU; OpenCL programming model; abstract representation CCRwS; automatic mapping single-device OpenCL program; code transformation; data transfer; global memory access patterns; global memory access regions; heterogeneous multi-device platform; kernel codes; multiple devices; static program analysis; Abstracts; Benchmark testing; Computer architecture; Indexes; Kernel; Optimization; Performance evaluation; Automatic; Performance; multi-device
Conference_Title :
High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC), 2013 IEEE 10th International Conference on
Conference_Location :
Zhangjiajie
DOI :
10.1109/HPCC.and.EUC.2013.28