DocumentCode
3048698
Title
Exploiting memory customization in FPGA for 3D stencil computations
Author
Shafiq, Muhammad ; Pericàs, Miquel ; De la Cruz, Raul ; Araya-Polo, Mauricio ; Navarro, Nacho ; Ayguadé, Eduard
Author_Institution
Comput. Sci., Barcelona Supercomput. Center, Barcelona, Spain
fYear
2009
fDate
9-11 Dec. 2009
Firstpage
38
Lastpage
45
Abstract
3D stencil computations are compute-intensive kernels often appearing in high-performance scientific and engineering applications. The key to efficiency in these memory-bound kernels is full exploitation of data reuse. This paper explores the design aspects for 3D-Stencil implementations that maximize the reuse of all input data on an FPGA architecture. The work focuses on the architectural design of 3D stencils with the form n × (n + 1) × n, where n = {2, 4, 6, 8, ...}. The performance of the architecture is evaluated using two design approaches, "Multi-Volume" and "Single-Volume". When n = 8, the designs achieve a sustained throughput of 55.5 GFLOPS in the "Single-Volume" approach and 103 GFLOPS in the "Multi-Volume" design approach in a 100-200 MHz multi-rate implementation on a Virtex-4 LX200 FPGA. This corresponds to a stencil data delivery of 1500 bytes/cycle and 2800 bytes/cycle respectively. The implementation is analyzed and compared to two CPU cache approaches and to the statically scheduled local stores on the IBM PowerXCell 8i. The FPGA approaches designed here achieve much higher bandwidth despite the FPGA device being the least recent of the chips considered. These numbers show how a custom memory organization can provide large data throughput when implementing 3D stencil kernels.
Keywords
field programmable gate arrays; signal processing; 3D stencil computations; FPGA; IBM PowerXCell 8i; data reuse; memory customization; memory organization; memory-bound kernels; Bandwidth; Computer applications; Field programmable gate arrays; Finite difference methods; Finite impulse response filter; Hardware; Kernel; Nearest neighbor searches; Throughput; Time domain analysis;
fLanguage
English
Publisher
IEEE
Conference_Titel
Field-Programmable Technology, 2009. FPT 2009. International Conference on
Conference_Location
Sydney, NSW
Print_ISBN
978-1-4244-4375-8
Electronic_ISBN
978-1-4244-4377-2
Type
conf
DOI
10.1109/FPT.2009.5377644
Filename
5377644