DocumentCode
3059648
Title
Inexpensive throughput enhancement in small-scale embedded microprocessors with block multithreading: extensions, characterization, and tradeoffs
Author
Haskins, John W., Jr. ; Hirst, Kevin R. ; Skadron, Kevin
Author_Institution
Dept. of Comput. Sci., Virginia Univ., Charlottesville, VA, USA
fYear
2001
fDate
36982
Firstpage
319
Lastpage
328
Abstract
This paper examines differential multithreading (DMT) as an attractive organization for coping with pipeline stalls in small-scale processors like those used in embedded environments. The paper proposes extensions to block multithreading to cope with data- and instruction-cache misses, and then explores some of the design tradeoffs that this enables. Results show that DMT boosts throughput substantially and can in fact replace dynamic branch prediction or data forwarding, or can be used to reduce the sizes of the instruction and data caches. Block multithreading, described by Farrens and Pleszkun (1991), is a technique to achieve high throughput from a single-issue microarchitecture by switching among multiple instruction streams in response to pipeline stalls. Although single-issue organizations are no longer used in high-performance processors, they remain common even in newly designed processors for small-scale, embedded devices. Like the original description of block multithreading, DMT uses auxiliary pipeline registers to save the state of in-flight instructions. By coping with data- and instruction-cache misses, however, our implementation can attack all the major sources of pipeline stalls. Overall, we find that DMT can substantially lower the cost and complexity of microprocessors for embedded environments, especially environments for which throughput rather than speed is the primary concern. In addition, DMT is an attractive prospect for use in chip-multiprocessing environments.
Keywords
cache storage; embedded systems; fault tolerant computing; microprocessor chips; multi-threading; pipeline processing; auxiliary pipeline registers; block multithreading; chip-multiprocessing environments; data-cache misses; design tradeoffs; differential multithreading; in-flight instructions; instruction-cache misses; multiple instruction stream switching; pipeline stall; single-issue microarchitecture; small-scale embedded microprocessors; throughput enhancement; Computer science; Games; Hardware; Microprocessors; Multithreading; OFDM modulation; Pipelines; Switches; Throughput; Yarn;
fLanguage
English
Publisher
IEEE
Conference_Titel
IEEE International Conference on Performance, Computing, and Communications, 2001
Conference_Location
Phoenix, AZ
Print_ISBN
0-7803-7001-5
Type
conf
DOI
10.1109/IPCCC.2001.918669
Filename
918669