Title :
Combining Process-Based Cache Partitioning and Pollute Region Isolation to Improve Shared Last Level Cache Utilization on Multicore Systems
Author :
Tao Huang ; Jing Wang ; Xuetao Guan ; Qi Zhong ; Keyi Wang
Author_Institution :
Microprocessor R&D Center, Peking Univ., Beijing, China
Abstract :
Shared last-level caches are widely used in modern multicore processors. However, uncontrolled cache sharing on multicore processors leads to more serious cache pollution than on single-core processors: a process with weak locality can evict strong-locality data sets that belong to other concurrent processes, so processes running together on a multicore system with a shared last-level cache interfere with one another. Prior approaches either partition the shared cache at the process level to reduce inter-process cache contention, or isolate non-temporal memory accesses to accelerate the execution of a single application. Process-based cache partitioning, however, may make intra-process cache pollution more serious and can greatly affect the performance of a single process. In this work, we take an alternative view and explore physical page layout optimization that combines process-based cache partitioning with pollute region isolation to improve shared last-level cache utilization on multicore systems. Our proposed approach consists of three steps. The first step determines the cache sizes of the co-scheduled applications. The second step identifies the weak-locality regions of each application under different cache size configurations. The third step customizes the physical page layout to partition cache space among concurrent processes and sets up a global pollute buffer that maps pollute regions into a small slice of the shared last-level cache. Our approach can be used directly on commercial multicore systems without any additional hardware requirement. Our experimental results show that, compared with the default Linux memory management scheme, our approach improves performance by 26.73% on average. Even compared with RapidMRC, a process-based cache partitioning scheme, our approach further eliminates the harmful effect of non-reusable data and improves system performance by 5.63% on average.
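The third step can be illustrated with a minimal sketch, assuming the page layout is controlled through page coloring (listed among the keywords) on a physically indexed, set-associative cache. The cache geometry, the share values, and the names `num_colors`, `page_color`, and `partition_colors` are hypothetical choices for illustration and are not taken from the paper.

```python
PAGE_SIZE = 4096  # bytes; assumed page size, not stated in the abstract


def num_colors(cache_bytes, ways, page_size=PAGE_SIZE):
    """Number of page colors: the size of one cache way divided by the page size."""
    return cache_bytes // (ways * page_size)


def page_color(phys_addr, colors, page_size=PAGE_SIZE):
    """Color of the physical page containing phys_addr."""
    return (phys_addr // page_size) % colors


def partition_colors(colors, shares, pollute_colors=1):
    """Split page colors among processes and reserve a pollute buffer.

    shares maps a process name to a relative cache share (hypothetical
    stand-in for the cache sizes chosen in the first step). The last
    `pollute_colors` colors are reserved as a shared pollute buffer.
    """
    usable = colors - pollute_colors
    total = sum(shares.values())
    layout = {}
    start = 0
    items = list(shares.items())
    for i, (name, share) in enumerate(items):
        # The last process takes whatever remains so that no color is lost
        # to integer rounding.
        if i == len(items) - 1:
            count = usable - start
        else:
            count = usable * share // total
        layout[name] = list(range(start, start + count))
        start += count
    layout["pollute_buffer"] = list(range(usable, colors))
    return layout


# Hypothetical 4 MiB, 16-way cache shared by two processes.
colors = num_colors(4 * 1024 * 1024, 16)
layout = partition_colors(colors, {"A": 3, "B": 1}, pollute_colors=2)
```

In this sketch, pages of a process would be allocated only from frames whose color is in that process's list, while pages belonging to weak-locality regions of any process would be allocated from the `pollute_buffer` colors, confining their cache footprint to a small slice.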
Keywords :
cache storage; concurrency control; multiprocessing systems; processor scheduling; cache size configurations; co-scheduled applications; concurrent processes; global pollute buffer; intraprocess cache pollution; multicore processors; multicore systems; multiprocessing environment; nonreusable data effect elimination; physical page layout optimization; pollute region isolation; process-based cache space partitioning; shared last-level cache utilization improvement; strong-locality data sets; system performance improvement; uncontrolled cache sharing; weak-locality process; weak-locality regions; Art; Benchmark testing; Image color analysis; Multicore processing; Optimization; Pollution; Program processors; Cache Partitioning; Cache Pollution; Memory Region; Miss rate Curves; Page Coloring;
Conference_Title :
Trust, Security and Privacy in Computing and Communications (TrustCom), 2013 12th IEEE International Conference on
Conference_Location :
Melbourne, VIC
DOI :
10.1109/TrustCom.2013.139