DocumentCode :
3738051
Title :
OpenCL library of stream memory components targeting FPGAs
Author :
Jasmina Vasiljevic;Ralph Wittig;Paul Schumacher;Jeff Fifield;Fernando Martinez Vallina;Henry Styles;Paul Chow
Author_Institution :
University of Toronto, Department of Electrical and Computer Engineering, Ontario, Canada
fYear :
2015
Firstpage :
104
Lastpage :
111
Abstract :
In recent years, high-level languages and compilers, such as OpenCL, have improved both productivity and FPGA adoption on a wider scale. One of the challenges in the design of high-performance stream FPGA applications is the iterative manual optimization of the numerous application buffers (e.g., arrays, FIFOs and scratch-pads). First, to achieve the desired throughput, the programmer faces the burden of analyzing the memory accesses of each application buffer and, based on the observed data locality, determining the optimal on-chip buffering and off-chip read/write data access strategy. Second, to minimize throughput bottlenecks, the programmer has to carefully partition the limited on-chip memory resources among many application buffers. In this work, we present an FPGA OpenCL library of pre-optimized stream memory components (SMCs). The library contains three types of SMCs, which implement frequently applied data transformations: 1) stencil, 2) transpose and 3) tiling. The library generates SMCs that are optimized both for the specific data transformation they perform and for the user-specified data set size. Further, to ease the partitioning of on-chip memory resources among many application memories, the library automatically maps application buffers to on-chip and off-chip memory resources. This is achieved by enabling the programmer to specify an on-chip memory budget for each component. In terms of on-chip memory, the SMCs perform data buffering to exploit data locality and maximize reuse. In terms of off-chip memory accesses, the SMCs optimize read/write memory operations by performing data coalescing, bursting and prefetching. We show that, using the SMC library, the programmer can quickly generate scalable, pre-optimized stream application memory components, thus reaching throughput targets without time-consuming manual memory optimization.
Keywords :
"Kernel","Libraries","Field programmable gate arrays","Optimization","System-on-chip","Throughput","Streaming media"
Publisher :
IEEE
Conference_Title :
Field Programmable Technology (FPT), 2015 International Conference on
Type :
conf
DOI :
10.1109/FPT.2015.7393134
Filename :
7393134