DocumentCode
128992
Title
Reducing set-associative L1 data cache energy by early load data dependence detection (ELD3)
Author
Bardizbanyan, A. ; Sjalander, M. ; Whalley, David ; Larsson-Edefors, Per
Author_Institution
Chalmers Univ. of Technol., Gothenburg, Sweden
fYear
2014
fDate
24-28 March 2014
Firstpage
1
Lastpage
4
Abstract
Fast set-associative level-one data caches (L1 DCs) access all ways in parallel during load operations to reduce access latency. This is required in order to resolve data dependencies as early as possible in the pipeline, which would otherwise suffer from stall cycles. A significant amount of energy is wasted by this fast access, since the data can reside in only one of the ways. While it is possible to reduce L1 DC energy usage by accessing the tag and data memories sequentially, hence activating only one data way on a tag match, this approach significantly increases execution time due to an increased number of stall cycles. We propose an early load data dependency detection (ELD3) technique for in-order pipelines. This technique makes it possible to detect whether a load instruction has a data dependency with a subsequent instruction. If there is no such dependency, then the tag and data accesses for the load are performed sequentially so that only the data way in which the data resides is accessed. If there is a dependency, then the tag and data arrays are accessed in parallel to avoid introducing additional stall cycles. For the MiBench benchmark suite, the ELD3 technique enables about 49% of all load operations to access the L1 DC sequentially. Based on 65-nm data using commercial SRAM blocks, the proposed technique reduces L1 DC energy by 13%.
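The access-mode decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the dependency is detected in hardware, so the `window` parameter (how many following instructions are checked), the 4-way default, and the names `Load`, `Instr`, `has_early_dependency` and `data_ways_activated` are all assumptions made for illustration. The sketch also assumes the load hits in the cache.

```python
from dataclasses import dataclass


@dataclass
class Load:
    dest: int      # destination register written by the load


@dataclass
class Instr:
    srcs: tuple    # source registers read by the instruction


def has_early_dependency(load, following, window=1):
    """Return True if any of the next `window` instructions reads load.dest.

    `window` is a hypothetical parameter; the abstract does not give one.
    """
    return any(load.dest in ins.srcs for ins in following[:window])


def data_ways_activated(load, following, num_ways=4, window=1):
    """Count the data ways read for one load that hits in the cache."""
    if has_early_dependency(load, following, window):
        # Dependent consumer is close: access tags and all data ways in
        # parallel so that no extra stall cycle is introduced.
        return num_ways
    # No nearby consumer: check the tags first, then read only the
    # single data way that matched.
    return 1
```

Under these assumptions, a load whose result is consumed by the very next instruction activates all four data ways, while a load with no nearby consumer activates only one.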
Keywords
SRAM chips; cache storage; content-addressable storage; ELD3 technique; L1 DCs; MiBench benchmark suite; SRAM blocks; data arrays; data memories; early load data dependence detection; fast set-associative level-one data caches; in-order pipelines; reduced access latency; set-associative L1 data cache energy reduction; size 65 nm; Arrays; Benchmark testing; Integrated circuits; Pipelines; Program processors; Random access memory; Registers
fLanguage
English
Publisher
IEEE
Conference_Titel
Design, Automation and Test in Europe Conference and Exhibition (DATE), 2014
Conference_Location
Dresden
Type
conf
DOI
10.7873/DATE.2014.095
Filename
6800296