DocumentCode :
3539860
Title :
TAP: A TLP-aware cache management policy for a CPU-GPU heterogeneous architecture
Author :
Lee, Jaekyu ; Kim, Hyesoon
Author_Institution :
Sch. of Comput. Sci., Georgia Inst. of Technol., Atlanta, GA, USA
fYear :
2012
fDate :
25-29 Feb. 2012
Firstpage :
1
Lastpage :
12
Abstract :
Combining CPUs and GPUs on the same chip has become a popular architectural trend. However, these heterogeneous architectures put more pressure on shared resource management. In particular, managing the last-level cache (LLC) is critical to performance. Researchers have recently proposed several shared cache management mechanisms, including dynamic cache partitioning and promotion-based cache management, but no cache management work has been done on CPU-GPU heterogeneous architectures. Sharing the LLC between CPUs and GPUs brings new challenges due to the different characteristics of CPU and GPGPU applications. Unlike most memory-intensive CPU benchmarks, which hide memory latency with caching, many GPGPU applications hide memory latency by combining thread-level parallelism (TLP) and caching. In this paper, we propose a TLP-aware cache management policy for CPU-GPU heterogeneous architectures. We introduce a core-sampling mechanism to detect how caching affects the performance of a GPGPU application. Inspired by two previous cache management schemes, Utility-based Cache Partitioning (UCP) and Re-Reference Interval Prediction (RRIP), we propose two new mechanisms: TAP-UCP and TAP-RRIP. TAP-UCP improves performance by 5% over UCP and 11% over LRU on 152 heterogeneous workloads, and TAP-RRIP improves performance by 9% over RRIP and 12% over LRU.
Keywords :
cache storage; graphics processing units; CPU-GPU heterogeneous architecture; TAP; TAP-RRIP; TAP-UCP; TLP-aware cache management policy; core-sampling mechanism; dynamic cache partitioning; last-level cache management; promotion-based cache management; rereference interval prediction; several shared cache management mechanisms; shared resource management; thread-level parallelism; utility-based cache partitioning; Benchmark testing; Computer architecture; Graphics processing unit; Instruction sets; Measurement; Radiation detectors; System-on-a-chip;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
High Performance Computer Architecture (HPCA), 2012 IEEE 18th International Symposium on
Conference_Location :
New Orleans, LA
ISSN :
1530-0897
Print_ISBN :
978-1-4673-0827-4
Electronic_ISBN :
1530-0897
Type :
conf
DOI :
10.1109/HPCA.2012.6168947
Filename :
6168947