DocumentCode :
704051
Title :
Hybrid adaptive clock management for FPGA processor acceleration
Author :
Gheolbanoiu, Alexandra ; Petrica, Lucian ; Cotofana, Sorin
Author_Institution :
Univ. Politeh. of Bucharest, Bucharest, Romania
fYear :
2015
fDate :
9-13 March 2015
Firstpage :
1359
Lastpage :
1364
Abstract :
As FPGA speed, power efficiency, and logic capacity increase, so does the number of applications that make use of FPGA processors. However, due to placement and routing constraints, balancing instruction delays in FPGA processors is a real challenge, especially when the implementation approaches the FPGA resource capacity. Consequently, even though some instructions can operate at high frequencies, the slow instructions determine the processor clock period, resulting in the underutilisation of the processor's potential. However, the latent performance of the fast instructions may be harnessed through Adaptive Clock Management (ACM), i.e., by dynamically adapting the clock frequency such that each instruction gets sufficient time for correct completion. To date, ACM-augmented FPGA processors have been proposed based on Clock Multiplexing (CM), but they suffer from long clock switching delays, which could nullify most of the potential ACM performance gain. This paper proposes an effective FPGA-tailored clock manipulation approach able to leverage the ACM potential. We first evaluate Clock Stretching (CS), i.e., the temporary clock period augmentation, as a CM alternative in FPGA processor designs and introduce an FPGA-specific CS circuit implementation. Subsequently, we evaluate the advantages and drawbacks of the two techniques and propose a Hybrid ACM, which monitors the processor instruction stream and determines the optimal adaptive clocking strategy in order to provide the maximum speedup for the executing program. Given that CS has very low latency at the expense of limited accuracy and dynamic range, we rely on it when the program requires frequent clock period changes. Otherwise, we utilise CM, which is rather slow but enables the FPGA processor to operate at the edge of its hardware capabilities. We evaluate our proposal on a vector processor mapped on a Xilinx Zynq FPGA. Our experiments indicate that on Sum of Squared Differences algorithm, neural network, and FIR filter execution traces the hybrid ACM provides up to a 14% performance increase over the CM-based ACM.
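The trade-off described in the abstract can be illustrated with a toy timing model. This is only a sketch under stated assumptions, not the circuit or selection logic from the paper: the function names, the per-window oracle used for the hybrid choice, and all numbers (instruction delays, available CM periods, switch penalty, CS base period and step, in arbitrary time units) are hypothetical.

```python
def fixed_clock_time(trace):
    """Baseline: every instruction runs at the period of the slowest one."""
    return len(trace) * max(trace)


def cm_time(trace, periods, switch_penalty):
    """Clock multiplexing (toy model): use the shortest available period
    that fits each instruction, paying switch_penalty on every change."""
    total, current = 0, None
    for delay in trace:
        period = min(p for p in periods if p >= delay)
        if current is not None and period != current:
            total += switch_penalty
        total += period
        current = period
    return total


def cs_time(trace, base_period, step):
    """Clock stretching (toy model): run at base_period and lengthen a
    cycle in coarse multiples of `step`, with no switching penalty."""
    total = 0
    for delay in trace:
        extra = max(0, delay - base_period)
        total += base_period + -(-extra // step) * step  # ceiling division
    return total


def hybrid_time(trace, window, periods, switch_penalty, base_period, step):
    """Hypothetical hybrid: per window, pick the cheaper of CM and CS.
    Entering CM is charged one extra switch_penalty (an assumption)."""
    total = 0
    for start in range(0, len(trace), window):
        chunk = trace[start:start + window]
        cm = switch_penalty + cm_time(chunk, periods, switch_penalty)
        cs = cs_time(chunk, base_period, step)
        total += min(cm, cs)
    return total


# Hypothetical trace: a steady phase of fast instructions followed by a
# phase that alternates between fast and slow instructions.
TRACE = [4] * 16 + [4, 10] * 8
PARAMS = dict(periods=[4, 6, 10], switch_penalty=10, base_period=5, step=5)

if __name__ == "__main__":
    print("fixed :", fixed_clock_time(TRACE))
    print("CM    :", cm_time(TRACE, PARAMS["periods"], PARAMS["switch_penalty"]))
    print("CS    :", cs_time(TRACE, PARAMS["base_period"], PARAMS["step"]))
    print("hybrid:", hybrid_time(TRACE, window=16, **PARAMS))
```

With these made-up parameters, CM alone is slower than the fixed clock on the alternating phase because the switch penalty dominates, CS alone loses a little on the steady phase because of its coarser base period, and the per-window choice gives the lowest total.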
Keywords :
clocks; field programmable gate arrays; multiplexing; ACM; CM; CS circuit; FIR filter; FPGA processor acceleration; FPGA resource capacity; FPGA tailored clock manipulation approach; Xilinx Zynq FPGA; clock frequency; clock multiplexing; clock stretching; clock switching delay; field programmable gate array; finite impulse response filter; hybrid adaptive clock management; instruction delay balancing; neural network; optimal adaptive clocking strategy; processor clock period; sum of squared differences algorithm; temporary clock period augmentation; vector processor; Clocks; Delays; Field programmable gate arrays; Multiplexing; Pipelines; Switches; Table lookup;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Design, Automation & Test in Europe Conference & Exhibition (DATE), 2015
Conference_Location :
Grenoble
Print_ISBN :
978-3-9815-3704-8
Type :
conf
Filename :
7092603