DocumentCode
258611
Title
A unified OpenCL-flavor programming model with scalable hybrid hardware platform on FPGAs
Author
Hongyuan Ding ; Miaoqing Huang
Author_Institution
Dept. of Comput. Sci. & Comput. Eng., Univ. of Arkansas, Fayetteville, AR, USA
fYear
2014
fDate
8-10 Dec. 2014
Firstpage
1
Lastpage
7
Abstract
Hardware accelerators can deliver significant performance improvements. However, designing them offers little flexibility and low productivity. Combining hardware accelerators with a multiprocessor system-on-chip (MPSoC) is one way to balance flexibility, productivity, and performance. In this work, we present a unified hybrid OpenCL-flavor (HOpenCL) parallel programming model on MPSoC that supports both hardware and software kernels. By integrating the HOpenCL hardware IPs and software libraries, the same kernel function can execute either as a hardware kernel on a dedicated hardware accelerator or as a software kernel on a general-purpose processor. An automatic design flow generates the corresponding hybrid hardware platform along with the executable. We use 512×512 matrix multiplication to examine the potential of our hybrid system in terms of performance, scalability, and productivity. The results show that hardware kernels achieve a speedup of more than 10 times over software kernels. Our prototype platform also shows good performance scalability as the number of group computation units (GCUs) increases from 1 to 6, until the problem becomes memory bound. On the Zynq 7045 device, the performance of one hard ARM core is equivalent to that of 2 or 3 GCUs with software kernel implementations. In contrast, a single GCU with a hardware kernel implementation is 5 times faster than the ARM core.
Keywords
field programmable gate arrays; matrix multiplication; parallel programming; ARM core; FPGA; HOpenCL parallel programming model; MPSoC; OpenCL-flavor programming model; Zynq 7045 device; field programmable gate array; general-purpose processors; group computation unit; hardware accelerators; hardware kernel; kernel function; matrix multiplication; memory bound problem; multiprocessor system-on-chip; scalable hybrid hardware platform; software kernel; Field programmable gate arrays; Hardware; Kernel; Program processors; Programming; System-on-chip;
fLanguage
English
Publisher
IEEE
Conference_Titel
ReConFigurable Computing and FPGAs (ReConFig), 2014 International Conference on
Conference_Location
Cancun
Print_ISBN
978-1-4799-5943-3
Type
conf
DOI
10.1109/ReConFig.2014.7032563
Filename
7032563