Title :
Reducing latency in an SRAM/DRAM cache hierarchy via a novel Tag-Cache architecture
Author :
Hameed, Fazal ; Bauer, Lars ; Henkel, Jörg
Author_Institution :
Embedded Syst., Karlsruhe Inst. of Technol., Karlsruhe, Germany
Abstract :
Memory speed has become a major performance bottleneck as more and more cores are integrated on a multi-core chip. The widening latency gap between high-speed cores and memory has led to the evolution of multi-level SRAM/DRAM cache hierarchies that exploit the latency benefits of smaller caches (e.g., private L1 and L2 SRAM caches) and the capacity benefits of larger caches (e.g., a shared L3 SRAM cache and a shared L4 DRAM cache). The main problem of employing large L3/L4 caches is their high tag lookup latency. To solve this problem, we introduce the novel concept of small, low-latency SRAM/DRAM Tag-Cache structures that can quickly determine whether an access to the large L3/L4 caches will be a hit or a miss. The performance of the proposed Tag-Cache architecture depends on the Tag-Cache hit rate; to improve it, we propose a novel Tag-Cache insertion policy and a DRAM row buffer mapping policy that reduce the latency of memory requests. For a 16-core system, this improves the average harmonic mean instructions-per-cycle throughput of latency-sensitive applications by 13.3% compared to the state of the art.
Keywords :
DRAM chips; SRAM chips; cache storage; DRAM row buffer mapping policy; SRAM/DRAM cache hierarchy; average harmonic mean instruction; high speed cores; high tag lookup latency; multi-core chip; private L1 SRAM caches; private L2 SRAM caches; shared L3 SRAM cache; shared L4 DRAM cache; tag-cache architecture; tag-cache insertion policy; widening latency gap; Arrays; Monitoring; Multicore processing; Organizations; Program processors; Radiation detectors; Random access memory
Conference_Title :
Design Automation Conference (DAC), 2014 51st ACM/EDAC/IEEE
Conference_Location :
San Francisco, CA
DOI :
10.1145/2593069.2593197