DocumentCode :
3200113
Title :
A High Performance, Energy Efficient GALS Processor Microarchitecture with Reduced Implementation Complexity
Author :
Zhu, YongKang ; Albonesi, David H. ; Buyuktosunoglu, Alper
Author_Institution :
Dept. of Electr. & Comput. Eng., Rochester Univ., NY
fYear :
2005
fDate :
20-22 March 2005
Firstpage :
42
Lastpage :
53
Abstract :
As the costs and challenges of global clock distribution grow with each new microprocessor generation, a globally asynchronous, locally synchronous (GALS) approach becomes an attractive alternative. One proposed GALS approach, called a multiple clock domain (MCD) processor, achieves impressive energy savings for a relatively low performance cost. However, the approach requires separating the processor into four domains, including separating the integer and memory domains, which complicates load scheduling, and the implementation of 32 voltage and frequency levels in each domain. In addition, the hardware-based control algorithm, though effective overall, produces a significant performance degradation for some applications. In this paper, we devise modifications to the MCD design that retain many of its benefits while greatly reducing the implementation complexity. We first determine that the synchronization channels that are most responsible for the MCD performance degradation are those involving cache access, and propose merging the integer and memory domains to virtually eliminate this overhead. We further propose significantly reducing the number of voltage levels, separating the reorder buffer into its own domain to permit front-end frequency scaling, separating the L2 cache to permit standard power optimizations to be used, and a new online algorithm that provides consistent results across our benchmark suite. The overall result is a significant reduction in the performance degradation of the original MCD approach and greater energy savings, with a greatly simplified microarchitecture that is much easier to implement.
Keywords :
benchmark testing; cache storage; clocks; computer architecture; microprocessor chips; synchronisation; L2 cache; benchmark suite; energy efficient GALS processor; globally asynchronous locally synchronous approach; microarchitecture; microprocessor chip; multiple clock domain processor; reorder buffer; synchronization channels; Clocks; Costs; Degradation; Energy efficiency; Frequency synchronization; Merging; Microprocessors; Processor scheduling; Synchronous generators; Voltage;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Performance Analysis of Systems and Software, 2005. ISPASS 2005. IEEE International Symposium on
Conference_Location :
Austin, TX
Print_ISBN :
0-7803-8965-4
Type :
conf
DOI :
10.1109/ISPASS.2005.1430558
Filename :
1430558