DocumentCode
2379692
Title
A simple latency tolerant processor
Author
Nekkalapu, S. ; Akkary, H. ; Jothi, Komal ; Retnamma, Renjith ; Song, Xiaoyu
Author_Institution
Electr. & Comput. Eng., American Univ. of Beirut, Beirut
fYear
2008
fDate
12-15 Oct. 2008
Firstpage
384
Lastpage
389
Abstract
The advent of multi-core processors and the emergence of new parallel applications that take advantage of such processors pose difficult challenges to designers. With relatively constant die sizes, limited on-chip cache, and scarce pin bandwidth, placing more cores on a chip reduces the amount of available cache and bus bandwidth per core, thereby exacerbating the memory wall problem. How can a designer build a processor that provides a core with good single-thread performance in the presence of long-latency cache misses, while enabling as many of these cores as possible to be placed on the same die for high throughput? Conventional latency-tolerant architectures that use out-of-order superscalar execution have become too complex and power-hungry for the multi-core era. Instead, we present a simple, non-blocking architecture that achieves memory latency tolerance without requiring complex out-of-order execution hardware or large, cycle-critical, and power-hungry structures such as dynamic schedulers, fully associative load and store queues, and reorder buffers. The non-blocking property of this architecture provides tolerance to hundreds of cycles of cache miss latency on a simple in-order issue core, thus allowing many more such cores to be integrated on the same die than is possible with a conventional out-of-order superscalar architecture.
Keywords
cache storage; logic design; microprocessor chips; multiprocessing systems; bus bandwidth; chip cache miss latency; die size; memory latency tolerant multicore processor design; memory wall problem; out-of-order superscalar architecture execution; parallel application; scarce pin bandwidth; single-thread performance; Bandwidth; Buffer storage; Delay; Dynamic scheduling; Hardware; Memory architecture; Multicore processing; Out of order; Process design; Throughput;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Design, 2008. ICCD 2008. IEEE International Conference on
Conference_Location
Lake Tahoe, CA
ISSN
1063-6404
Print_ISBN
978-1-4244-2657-7
Electronic_ISBN
1063-6404
Type
conf
DOI
10.1109/ICCD.2008.4751889
Filename
4751889
Link To Document