DocumentCode :
3127902
Title :
A Traversal Cache Framework for FPGA Acceleration of Pointer Data Structures: A Case Study on Barnes-Hut N-body Simulation
Author :
Coole, James ; Wernsing, John ; Stitt, Greg
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Florida, Gainesville, FL, USA
fYear :
2009
fDate :
9-11 Dec. 2009
Firstpage :
143
Lastpage :
148
Abstract :
Numerous studies have shown that field-programmable gate arrays (FPGAs) often achieve large speedups compared to microprocessors. However, one significant limitation of FPGAs that has prevented their use on important applications is the requirement for regular memory access patterns. Traversal caches were previously introduced to improve the performance of FPGA implementations of algorithms with irregular memory access patterns, especially those traversing pointer-based data structures. However, a significant limitation of previous traversal caches is that speedup was limited to traversals repeated frequently over time, thus preventing speedup for algorithms without repetition, even if the similarity between traversals was large. This paper presents a new framework that extends traversal caches to enable performance improvements in such cases and provides additional improvements through reduced memory accesses and parallel processing of multiple traversals. Most importantly, we show that, for algorithms with highly similar traversals, the traversal cache framework achieves approximately linear kernel speedup with additional area, thus eliminating the memory bandwidth bottleneck commonly associated with FPGAs. We evaluate the framework using a Barnes-Hut n-body simulation case study, showing application speedups ranging from 12× to 13.5× on a Virtex4 LX100 with projected speedups as high as 40× on today´s largest FPGAs.
Keywords :
cache storage; field programmable gate arrays; microprocessor chips; Barnes-Hut N-body simulation; FPGA acceleration; Virtex4 LX100; field-programmable gate arrays; irregular memory access patterns; linear kernel speedup; microprocessors; pointer data structures; reduced memory accesses; regular memory access patterns; traversal cache framework; Acceleration; Bandwidth; Computational modeling; Computer simulation; Data structures; Field programmable gate arrays; Kernel; Linear approximation; Microprocessors; Parallel processing; FPGA; pointers; speedup; traversal cache;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Reconfigurable Computing and FPGAs, 2009. ReConFig '09. International Conference on
Conference_Location :
Quintana Roo
Print_ISBN :
978-1-4244-5293-4
Electronic_ISBN :
978-0-7695-3917-1
Type :
conf
DOI :
10.1109/ReConFig.2009.68
Filename :
5382042
Link To Document :
بازگشت