DocumentCode
2535015
Title
Address-indexed memory disambiguation and store-to-load forwarding
Author
Stone, Sam S. ; Woley, Kevin M. ; Frank, Matthew I.
Author_Institution
Dept. of Electr. & Comput. Eng., Illinois Univ., Urbana, IL
fYear
2005
fDate
16-16 Nov. 2005
Lastpage
182
Abstract
This paper describes a scalable, low-complexity alternative to the conventional load/store queue (LSQ) for superscalar processors that execute load and store instructions speculatively and out-of-order prior to resolving their dependences. Whereas the LSQ requires associative and age-prioritized searches for each access, we propose that an address-indexed store-forwarding cache (SFC) perform store-to-load forwarding and that an address-indexed memory disambiguation table (MDT) perform memory disambiguation. Neither structure includes a CAM. The SFC behaves as a small cache, accessed speculatively and out-of-order by both loads and stores. Because the SFC does not rename in-flight stores to the same address, violations of memory anti and output dependences can cause in-flight loads to obtain incorrect values from the SFC. Therefore, the MDT uses sequence numbers to detect and recover from true, anti, and output memory dependence violations. We observe empirically that loads and stores that violate anti and output memory dependences are rarely on a program's critical path and that the additional cost of enforcing predicted anti and output dependences among these loads and stores is minimal. In conjunction with a scheduler that enforces predicted anti and output dependences, the MDT and SFC yield performance equivalent to that of a large LSQ that has similar or greater circuit complexity. The SFC and MDT are scalable structures that yield high performance and lower dynamic power consumption than the LSQ, and they are well-suited for checkpointed processors with large instruction windows.
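The abstract's idea of detecting violations with sequence numbers in an address-indexed table can be illustrated with a minimal sketch. This is not the paper's implementation: the class names (`MDT`, `MDTEntry`), the table size, the 8-byte indexing granularity, and the policy of keeping only the highest load and store sequence number per entry are all assumptions made for illustration. The sketch has no tags, so two addresses that map to the same entry may report a spurious violation, and it models neither recovery nor entry reclamation at retirement.

```python
from dataclasses import dataclass


@dataclass
class MDTEntry:
    # Highest sequence numbers seen so far for this entry (-1 means none).
    load_seq: int = -1
    store_seq: int = -1


class MDT:
    """Hypothetical address-indexed disambiguation table (no CAM, no tags)."""

    def __init__(self, num_entries=64):
        self.num_entries = num_entries
        self.entries = [MDTEntry() for _ in range(num_entries)]

    def _entry(self, addr):
        # Assumed indexing: 8-byte granularity, direct-mapped.
        return self.entries[(addr >> 3) % self.num_entries]

    def load(self, seq, addr):
        """Record an executing load; return a violation kind or None."""
        e = self._entry(addr)
        if e.store_seq > seq:
            # A younger store already wrote this address: the load may
            # read that younger value (anti dependence violation).
            return "anti"
        e.load_seq = max(e.load_seq, seq)
        return None

    def store(self, seq, addr):
        """Record an executing store; return a violation kind or None."""
        e = self._entry(addr)
        if e.load_seq > seq:
            # A younger load already read this address before this older
            # store wrote it (true dependence violation).
            return "true"
        if e.store_seq > seq:
            # A younger store already wrote this address
            # (output dependence violation).
            return "output"
        e.store_seq = seq
        return None
```

With this sketch, a store with sequence number 5 followed by a load with sequence number 3 to the same address is reported as an anti dependence violation, since the older load executed after the younger store. A direct index into the table replaces the associative, age-prioritized search that the abstract attributes to the LSQ.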
Keywords
cache storage; memory architecture; storage allocation; address-indexed memory disambiguation; circuit complexity; instruction windows; load and store instructions; memory dependence violations; store-forwarding cache; store-to-load forwarding; superscalar processors; Buffer storage; CADCAM; Complexity theory; Computer aided manufacturing; Costs; Delay; Energy consumption; Microarchitecture; Out of order; Retirement;
fLanguage
English
Publisher
IEEE
Conference_Titel
Microarchitecture, 2005. MICRO-38. Proceedings. 38th Annual IEEE/ACM International Symposium on
Conference_Location
Barcelona
Print_ISBN
0-7695-2440-0
Type
conf
DOI
10.1109/MICRO.2005.10
Filename
1540958