DocumentCode :
2534834
Title :
A criticality analysis of clustering in superscalar processors
Author :
Salverda, Pierre ; Zilles, Craig
Author_Institution :
Dept. of Comput. Sci., Illinois Univ., Urbana, IL
fYear :
2005
fDate :
16-16 Nov. 2005
Lastpage :
66
Abstract :
Clustered machines partition hardware resources to circumvent the cycle time penalties incurred by large, monolithic structures. This partitioning introduces a long inter-cluster forwarding latency and the potential for load imbalance, both of which degrade IPC and thus counter the cycle time benefits of clustering. We show that program dataflow can be mapped to clustered machines so as to achieve an IPC rivaling that of an equivalent monolithic machine. That is, the IPC penalties observed by extant schemes are largely an artifact of instruction steering and scheduling policies. Using critical path analysis, we investigate and uncover the main causes for this performance loss. By way of code samples, we illustrate those causes and propose three policies for mitigating them. First, we introduce a new metric, likelihood of criticality, and show how it can halve the performance lost to contention-induced stalls. Second, we develop a stall-over-steer policy that addresses performance lost to inter-cluster forwarding delay. Finally, we show that a proactive load-balancing policy is necessary to improve the distribution of ready instructions among the clusters. Together, these three policies yield performance on 2-, 4- and 8-cluster implementations of an 8-wide machine that is within 2, 4, and 6%, respectively, of the monolithic equivalent.
Keywords :
data flow computing; fault tolerant computing; multiprocessing systems; resource allocation; workstation clusters; clustered machines; criticality analysis; hardware resource partitioning; instruction steering; intercluster forwarding latency; proactive load-balancing policy; program data flow; scheduling policy; stall-over-steer policy; superscalar processor clustering; Aggregates; Clocks; Computer science; Counting circuits; Degradation; Delay; Hardware; Out of order; Parallel processing; Processor scheduling;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Microarchitecture, 2005. MICRO-38. Proceedings. 38th Annual IEEE/ACM International Symposium on
Conference_Location :
Barcelona
Print_ISBN :
0-7695-2440-0
Type :
conf
DOI :
10.1109/MICRO.2005.6
Filename :
1540948