DocumentCode
2817415
Title
Decoupling local variable accesses in a wide-issue superscalar processor
Author
Cho, Sangyeun ; Pen-Chung Yew ; Gyungho Lee
Author_Institution
Samsung Electron. Co., Yongin-City, South Korea
fYear
1999
fDate
1999
Firstpage
100
Lastpage
110
Abstract
Providing adequate data bandwidth is extremely important for a wide-issue superscalar processor to achieve its full performance potential. Adding a large number of ports to a data cache, however, becomes increasingly inefficient and can add significantly to the hardware complexity. This paper takes an alternative or complementary approach to providing more data bandwidth, called the data-decoupled architecture. The approach, with support from the compiler and/or hardware, partitions the memory stream into two independent streams early in the processor pipeline, and feeds each stream to a separate memory access queue and cache. Under this model, the paper studies the potential of decoupling memory accesses to a program's local variables that are allocated on the run-time stack. Using a set of integer and floating-point programs from the SPEC95 benchmark suite, it is shown that local variable accesses constitute a large portion of all memory references, while their reference space is very small, averaging around 7 words per (static) procedure. To service local variable accesses quickly, two optimizations, fast data forwarding and access combining, are proposed and studied. Some of the important design parameters, such as the cache size, the number of cache ports, and the degree of access combining, are studied based on simulations. The potential performance of the proposed scheme is measured using various configurations, and it is concluded that the scheme can become a viable alternative to building a single multi-ported data cache.
Keywords
fault tolerant computing; parallel architectures; performance evaluation; program compilers; SPEC95 benchmark suite; cache size; compiler; data bandwidth; data cache; data-decoupled architecture; floating-point programs; hardware complexity; local variable accesses decoupling; multi-ported data cache; performance potential; wide-issue superscalar processor; Bandwidth; Clocks; Delay; Feeds; Large scale integration; Pipelines; Read only memory; Runtime;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Architecture, 1999. Proceedings of the 26th International Symposium on
Conference_Location
Atlanta, GA
ISSN
1063-6897
Print_ISBN
0-7695-0170-2
Type
conf
DOI
10.1109/ISCA.1999.765943
Filename
765943