DocumentCode
3200113
Title
A High Performance, Energy Efficient GALS Processor Microarchitecture with Reduced Implementation Complexity
Author
Zhu, YongKang ; Albonesi, David H. ; Buyuktosunoglu, Alper
Author_Institution
Dept. of Electr. & Comput. Eng., Rochester Univ., NY
fYear
2005
fDate
20-22 March 2005
Firstpage
42
Lastpage
53
Abstract
As the costs and challenges of global clock distribution grow with each new microprocessor generation, a globally asynchronous, locally synchronous (GALS) approach becomes an attractive alternative. One proposed GALS approach, called a multiple clock domain (MCD) processor, achieves impressive energy savings for a relatively low performance cost. However, the approach requires separating the processor into four domains, including separate integer and memory domains, which complicates load scheduling, and implementing 32 voltage and frequency levels in each domain. In addition, the hardware-based control algorithm, though effective overall, causes significant performance degradation for some applications. In this paper, we devise modifications to the MCD design that retain many of its benefits while greatly reducing the implementation complexity. We first determine that the synchronization channels most responsible for the MCD performance degradation are those involving cache access, and propose merging the integer and memory domains to virtually eliminate this overhead. We further propose significantly reducing the number of voltage levels, separating the reorder buffer into its own domain to permit front-end frequency scaling, separating the L2 cache to permit standard power optimizations to be used, and a new online algorithm that provides consistent results across our benchmark suite. The overall result is a significant reduction in the performance degradation of the original MCD approach and greater energy savings, with a greatly simplified microarchitecture that is much easier to implement.
Keywords
benchmark testing; cache storage; clocks; computer architecture; microprocessor chips; synchronisation; L2 cache; benchmark suite; energy efficient GALS processor; globally asynchronous locally synchronous approach; microarchitecture; microprocessor chip; multiple clock domain processor; reorder buffer; synchronization channels; Clocks; Costs; Degradation; Energy efficiency; Frequency synchronization; Merging; Microprocessors; Processor scheduling; Synchronous generators; Voltage
fLanguage
English
Publisher
IEEE
Conference_Titel
Performance Analysis of Systems and Software, 2005. ISPASS 2005. IEEE International Symposium on
Conference_Location
Austin, TX
Print_ISBN
0-7803-8965-4
Type
conf
DOI
10.1109/ISPASS.2005.1430558
Filename
1430558