DocumentCode :
2236637
Title :
Hardware Pessimistic Run-Time Profiling for a Self-Reconfigurable Embedded Processor Architecture
Author :
Agwa, Shady O. ; Ahmad, Hany H. ; Saleh, Awad I.
Author_Institution :
Electr. Eng. Dept., Assiut Univ., Assiut, Egypt
fYear :
2010
fDate :
13-15 Dec. 2010
Firstpage :
162
Lastpage :
167
Abstract :
Embedded processors are expected to migrate toward self-reconfigurable architectures, and in the near future the self-reconfiguration concept will be able to support revolutionary architectural innovations. The research described in this paper is a continuation of our previous work [1]. In [1], the advantage of the so-called pessimistic run-time profiling approach was demonstrated and compared to preparation-mode profiling and optimistic run-time profiling. Using pessimistic run-time profiling, a 36.09% reduction in execution time was achieved, compared to 23.02% in the optimistic run-time profiling case. These results were obtained using the "High Pass Grey-Scale Filter" benchmark as our case study [2]. Here, we extend the previous results to include a hardware implementation of pessimistic run-time profiling. Because run-time profiling executes in parallel with the running algorithm, the execution time reduction increased to 57.73% on the same benchmark. The total energy consumed by our hardware implementation was only 54 pJ, and some 1,371 gates were added to the design. These two figures represent increases of approximately 1.4×10^-5% in energy consumption and 1.828% in gate count, relative to the pessimistic-accelerated case and the main core gate count, respectively. The hardware run-time profiling unit introduced here is based on the "predetermined basic regions detection" philosophy and can operate at a maximum frequency of approximately 278 MHz for this particular case study. Profiling each critical region takes less than 3.6 ns and executes in parallel with the decode stage of the main processor instruction pipeline. Hardware pessimistic run-time profiling is thus able to achieve the same speedup as the full acceleration case (in which no profiling is used), with only a marginal increase in area and energy consumption compared to the non-accelerated case. All results were obtained using Tensilica [3] and Xilinx [4] tools.
Keywords :
embedded systems; low-power electronics; parallel processing; pipeline processing; power aware computing; reconfigurable architectures; Tensilica tool; Xilinx tool; energy consumption; execution time reduction; gate count; high pass grey scale filter; parallel execution; pessimistic run-time profiling; predetermined basic regions detection; processor instruction pipeline; self-reconfigurable embedded processor architecture; Acceleration; Architecture; Embedded; Processors; Profiling; Reconfigurable;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Reconfigurable Computing and FPGAs (ReConFig), 2010 International Conference on
Conference_Location :
Quintana Roo
Print_ISBN :
978-1-4244-9523-8
Electronic_ISBN :
978-0-7695-4314-7
Type :
conf
DOI :
10.1109/ReConFig.2010.12
Filename :
5695299