DocumentCode :
258611
Title :
A unified OpenCL-flavor programming model with scalable hybrid hardware platform on FPGAs
Author :
Hongyuan Ding ; Miaoqing Huang
Author_Institution :
Dept. of Comput. Sci. & Comput. Eng., Univ. of Arkansas, Fayetteville, AR, USA
fYear :
2014
fDate :
8-10 Dec. 2014
Firstpage :
1
Lastpage :
7
Abstract :
Hardware accelerators can achieve significant performance improvements. However, designing hardware accelerators lacks flexibility and productivity. Combining hardware accelerators with a multiprocessor system-on-chip (MPSoC) is an alternative way to balance flexibility, productivity, and performance. In this work, we present a unified hybrid OpenCL-flavor (HOpenCL) parallel programming model on MPSoC that supports both hardware and software kernels. By integrating the HOpenCL hardware IPs and software libraries, the same kernel function can execute either as a hardware kernel on dedicated hardware accelerators or as a software kernel on general-purpose processors. Using the automatic design flow, the corresponding hybrid hardware platform is generated along with the executable. We use 512×512 matrix multiplication to examine the potential of our hybrid system in terms of performance, scalability, and productivity. The results show that hardware kernels reach more than a 10× speedup over software kernels. Our prototype platform also demonstrates good performance scalability as the number of group computation units (GCUs) increases from 1 to 6, until the problem becomes memory bound. Compared with the hard ARM core on the Zynq 7045 device, we find that the performance of one ARM core is equivalent to that of 2 or 3 GCUs with software kernel implementations. On the other hand, a single GCU with a hardware kernel implementation is 5 times faster than the ARM core.
Keywords :
field programmable gate arrays; matrix multiplication; parallel programming; ARM core; FPGA; HOpenCL parallel programming model; MPSoC; OpenCL-flavor programming model; Zynq 7045 device; field programmable gate array; general-purpose processors; group computation unit; hardware accelerators; hardware kernel; kernel function; matrix multiplication; memory bound problem; multiprocessor system-on-chip; scalable hybrid hardware platform; software kernel; Field programmable gate arrays; Hardware; Kernel; Program processors; Programming; System-on-chip;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
ReConFigurable Computing and FPGAs (ReConFig), 2014 International Conference on
Conference_Location :
Cancun
Print_ISBN :
978-1-4799-5943-3
Type :
conf
DOI :
10.1109/ReConFig.2014.7032563
Filename :
7032563