Title :
Characterizing multi-threaded applications for designing sharing-aware last-level cache replacement policies
Author :
Natarajan, R. ; Chaudhuri, Mainak
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of Minnesota, Minneapolis, MN, USA
Abstract :
Recent years have seen a large volume of proposals on managing the shared last-level cache (LLC) of chip-multiprocessors (CMPs). However, most of these proposals primarily focus on reducing the amount of destructive interference between competing independent threads of multi-programmed workloads. While very few of these studies evaluate the proposed policies on shared memory multi-threaded applications, they do not improve constructive cross-thread sharing of data in the LLC. In this paper, we characterize a set of multi-threaded applications drawn from the PARSEC, SPEC OMP, and SPLASH-2 suites with the goal of introducing sharing-awareness in LLC replacement policies. We motivate our characterization study by quantifying the potential contributions of the shared and the private blocks toward the overall volume of the LLC hits in these applications and show that the shared blocks are more important than the private blocks. Next, we characterize the amount of sharing-awareness enjoyed by recent proposals compared to the optimal policy. We design and evaluate a generic oracle that can be used in conjunction with any existing policy to quantify the potential improvement that can come from introducing sharing-awareness. The oracle analysis shows that introducing sharing-awareness reduces the number of LLC misses incurred by the least-recently-used (LRU) policy by 6% and 10% on average for a 4MB and 8MB LLC respectively. A realistic implementation of this oracle requires the LLC controller to have the capability to accurately predict, at the time a block is filled into the LLC, whether the block will be shared during its residency in the LLC. We explore the feasibility of designing such a predictor based on the address of the fill and the program counter of the instruction that triggers the fill.
Our sharing behavior predictability study of two history-based fill-time predictors that use block addresses and program counters concludes that achieving acceptable levels of accuracy with such predictors will require other architectural and/or high-level program semantic features that have strong correlations with active sharing phases of the LLC blocks.
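Illustrative sketch (not from the paper): the abstract mentions history-based fill-time predictors indexed by block address or program counter, but does not describe their design. The Python below shows one plausible shape for such a predictor, assuming a table of 2-bit saturating counters indexed by the fill PC and trained when a block leaves the LLC. The class name, table size, index function, and the `replay` helper are all hypothetical.

```python
class FillTimeSharingPredictor:
    """Hypothetical PC-indexed table of 2-bit saturating counters.

    Assumption: the LLC remembers the fill PC of each block and, when the
    block is evicted, reports whether more than one core touched it.
    """

    def __init__(self, table_bits=10):
        self.mask = (1 << table_bits) - 1
        # Counter values 0..3; start at 1, i.e. weakly "private".
        self.counters = [1] * (1 << table_bits)

    def _index(self, pc):
        # Drop the two low bits of the PC, then keep table_bits bits.
        return (pc >> 2) & self.mask

    def predict_shared(self, pc):
        """Prediction made at fill time: will this block be shared?"""
        return self.counters[self._index(pc)] >= 2

    def train(self, pc, was_shared):
        """Update at eviction time with the observed sharing outcome."""
        i = self._index(pc)
        if was_shared:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def replay(events, predictor):
    """Replay (fill_pc, was_shared) pairs and return prediction accuracy."""
    correct = 0
    total = 0
    for pc, was_shared in events:
        if predictor.predict_shared(pc) == was_shared:
            correct += 1
        predictor.train(pc, was_shared)
        total += 1
    return correct / total if total else 0.0
```

A predictor of this kind is accurate only when the sharing outcome correlates with the fill PC. The abstract reports that address- and PC-based histories alone do not reach acceptable accuracy, so the sketch is a baseline to reason about rather than a recommended design.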
Keywords :
cache storage; microprocessor chips; multi-threading; multiprogramming; shared memory systems; CMP; LLC controller; LLC replacement policies; LRU policy; PARSEC suite; SPEC OMP suite; SPLASH-2 suite; active LLC block sharing phases; architectural features; block addresses; chip-multiprocessors; constructive cross-thread data sharing; high-level program semantic features; history-based fill-time predictors; least-recently-used policy; multiprogrammed workloads; optimal policy; oracle analysis; private blocks; program counter; shared memory multithreaded applications; sharing behavior predictability; sharing-aware last-level cache replacement policies; sharing-awareness; Correlation; Digital signal processing; Load modeling;
Conference_Title :
Workload Characterization (IISWC), 2013 IEEE International Symposium on
Conference_Location :
Portland, OR
Print_ISBN :
978-1-4799-0553-9
DOI :
10.1109/IISWC.2013.6704665