DocumentCode :
1969736
Title :
A loop accelerator for low power embedded VLIW processors
Author :
Mathew, Binu ; Davis, Al
Author_Institution :
Sch. of Comput., Utah Univ., Salt Lake City, UT, USA
fYear :
2004
fDate :
8-10 Sept. 2004
Firstpage :
6
Lastpage :
11
Abstract :
The high transistor density afforded by modern VLSI processes has enabled the design of embedded processors that use clustered execution units to deliver high levels of performance. However, delivering data to the execution resources in a timely manner remains a major problem that limits ILP. It is particularly significant for embedded systems where memory and power budgets are limited. A distributed address generation and loop acceleration architecture for VLIW processors is presented. This decentralized on-chip memory architecture uses multiple SRAMs to provide high intra-processor bandwidth. Each SRAM has an associated stream address generator capable of implementing a variety of addressing modes in conjunction with a shared loop accelerator. The architecture is extremely useful for generating application specific embedded processors, particularly for processing input data which is organized as a stream. The idea is evaluated in the context of a fine grain VLIW architecture executing complex perception algorithms such as speech and visual feature recognition. Transistor level Spice simulations are used to demonstrate a 159x improvement in the energy delay product when compared to conventional architectures executing the same applications.
Keywords :
SRAM chips; VLSI; embedded systems; integrated circuit design; low-power electronics; memory architecture; multiprocessing systems; VLSI processes; associated stream address generator; clustered execution units; complex perception algorithms; data delivery; decentralized on-chip memory architecture; distributed address generation; loop acceleration architecture; low power design; low power embedded VLIW processors; multiple SRAM; shared loop accelerator; speech feature recognition; transistor level Spice simulations; visual feature recognition; Acceleration; Bandwidth; Distributed power generation; Embedded system; Memory architecture; Process design; Random access memory; Speech analysis; VLIW; Very large scale integration;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Hardware/Software Codesign and System Synthesis, 2004. CODES + ISSS 2004. International Conference on
Print_ISBN :
1-58113-937-3
Type :
conf
DOI :
10.1109/CODESS.2004.241153
Filename :
1360471