DocumentCode :
695236
Title :
Studying the impact of multicore processor scaling on directory techniques via reuse distance analysis
Author :
Zhao, Minshu ; Yeung, Donald
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Maryland at Coll. Park, College Park, MD, USA
fYear :
2015
fDate :
7-11 Feb. 2015
Firstpage :
590
Lastpage :
602
Abstract :
Researchers have proposed numerous directory techniques to address multicore scalability whose behavior depends on the CPU's particular configuration, e.g., core count and cache size. As CPUs continue to scale, it is essential to explore the directory's architecture dependences. However, this is challenging using detailed simulation given the large number of CPU configurations that are possible. This paper proposes to use multicore reuse distance analysis to study coherence directories. We develop a framework to extract the directory access stream from parallel LRU stacks, enabling rapid analysis of the directory's accesses and contents across both core count and cache size scaling. We also implement our framework in a profiler, and apply it to gain insights into multicore scaling's impact on the directory. Our profiling results show that directory accesses reduce by 3.5x across data cache size scaling, suggesting that techniques that trade off access latency for reduced capacity or conflicts become increasingly effective as cache size scales. We also show that the portion of on-chip memory devoted to the directory cache can be reduced by 53.3% across data cache size scaling, thus lowering the over-provisioning needed at large cache sizes. Finally, we validate our RD-based directory analyses, and find they are within 13% of cache simulations in terms of access count, on average.
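For readers unfamiliar with the term, the sketch below illustrates the general idea of reuse distance (LRU stack distance) on a single reference stream. It is not the paper's framework, which works on parallel LRU stacks and a directory access stream; the function names `reuse_distances` and `misses_for_capacity` and the toy trace are assumptions made purely for illustration. The point it shows is why one profiling pass can answer questions about many cache sizes: a reference misses in a fully associative LRU cache of a given capacity exactly when its reuse distance is at least that capacity.

```python
def reuse_distances(trace):
    """Return the LRU stack distance of each reference in `trace`.

    The distance is the number of distinct blocks touched since the
    previous reference to the same block; a first touch yields None.
    """
    stack = []       # LRU stack, most recently used block at the end
    distances = []
    for block in trace:
        if block in stack:
            pos = stack.index(block)
            # Blocks above `block` in the stack were touched since its last use.
            distances.append(len(stack) - 1 - pos)
            stack.pop(pos)
        else:
            distances.append(None)  # cold (first) reference
        stack.append(block)         # block becomes most recently used
    return distances


def misses_for_capacity(distances, capacity):
    """Count misses of a fully associative LRU cache with `capacity` blocks."""
    return sum(1 for d in distances if d is None or d >= capacity)


# Hypothetical toy trace of block addresses.
trace = ["A", "B", "C", "A", "B", "D", "A"]
dists = reuse_distances(trace)
for capacity in (2, 3, 4):
    print(capacity, misses_for_capacity(dists, capacity))
```

The linear-time stack search is chosen here for readability; practical profilers typically use tree-based structures to keep the per-reference cost logarithmic.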
Keywords :
cache storage; microprocessor chips; multiprocessing systems; CPU configuration; CPU particular configuration; RD-based directory analysis; cache simulation; cache size scale; coherence directory; core count; data cache size scaling; directory access stream; directory cache; directory technique; multicore processor scaling; multicore reuse distance analysis; multicore scalability; multicore scaling impact; on-chip memory; parallel LRU stack; profiler; tradeoff access latency; Coherence; Educational institutions; Histograms; Multicore processing; Protocols; Radiation detectors; System-on-chip;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
High Performance Computer Architecture (HPCA), 2015 IEEE 21st International Symposium on
Conference_Location :
Burlingame, CA
Type :
conf
DOI :
10.1109/HPCA.2015.7056065
Filename :
7056065