Title :
Hardware pipelining of runtime-detected loops
Author :
Bispo, Joao ; Cardoso, Joao M P ; Monteiro, Jose
Author_Institution :
CSE Dept., UTL, Lisbon, Portugal
Date :
Aug. 30 2012-Sept. 2 2012
Abstract :
Dynamic partitioning is a promising technique where computations are transparently moved from a General Purpose Processor (GPP) to a coprocessor during application execution. To be effective, the mapping of computations to the coprocessor needs to consider aggressive optimizations. One of the mapping optimizations is loop pipelining, a technique extensively studied and known to allow substantial performance improvements. This paper describes a technique for pipelining Megablocks, a type of runtime loop developed for dynamic partitioning. The technique transforms the body of Megablocks into an acyclic dataflow graph which can be fully pipelined and is based on the atomic execution of loop iterations. For a set of 9 benchmarks without memory operations, we generated pipelined hardware versions of the loops and estimate that the presented loop pipelining technique increases the average speedup of non-pipelined coprocessor accelerated designs from 1.6× to 2.2×. For a larger set of 61 benchmarks which include memory operations, the technique achieves a speedup increase from 2.5× to 5.6×.
Keywords :
coprocessors; data flow graphs; parallel architectures; pipeline processing; GPP; Megablock pipelining; acyclic data flow graph; atomic execution; dynamic mapping; dynamic partitioning; general purpose processor; hardware pipelining; loop iterations; loop pipelining; mapping optimization; memory operations; nonpipelined coprocessor accelerated design; runtime-detected loops; Artificial intelligence; Dynamic Mapping; Hardware Acceleration; Instruction Traces; Loop Pipelining; Reconfigurable Fabrics;
Conference_Title :
Integrated Circuits and Systems Design (SBCCI), 2012 25th Symposium on
Conference_Location :
Brasilia
Print_ISBN :
978-1-4673-2606-3
DOI :
10.1109/SBCCI.2012.6344443