Regional Information Center for Science and Technology - TAP: A TLP-aware cache management policy for a CPU-GPU heterogeneous architecture

DocumentCode :

3539860

Title :

TAP: A TLP-aware cache management policy for a CPU-GPU heterogeneous architecture

Author :

Lee, Jaekyu ; Kim, Hyesoon

Author_Institution :

Sch. of Comput. Sci., Georgia Inst. of Technol., Atlanta, GA, USA

Year :

2012

Date :

25-29 Feb. 2012

Firstpage :

Lastpage :

Abstract :

Combining CPUs and GPUs on the same chip has become a popular architectural trend. However, these heterogeneous architectures put more pressure on shared resource management. In particular, managing the last-level cache (LLC) is very critical to performance. Lately, many researchers have proposed several shared cache management mechanisms, including dynamic cache partitioning and promotion-based cache management, but no cache management work has been done on CPU-GPU heterogeneous architectures. Sharing the LLC between CPUs and GPUs brings new challenges due to the different characteristics of CPU and GPGPU applications. Unlike most memory-intensive CPU benchmarks that hide memory latency with caching, many GPGPU applications hide memory latency by combining thread-level parallelism (TLP) and caching. In this paper, we propose a TLP-aware cache management policy for CPU-GPU heterogeneous architectures. We introduce a core-sampling mechanism to detect how caching affects the performance of a GPGPU application. Inspired by previous cache management schemes, Utility-based Cache Partitioning (UCP) and Re-Reference Interval Prediction (RRIP), we propose two new mechanisms: TAP-UCP and TAP-RRIP. TAP-UCP improves performance by 5% over UCP and 11% over LRU on 152 heterogeneous workloads, and TAP-RRIP improves performance by 9% over RRIP and 12% over LRU.

Keywords :

cache storage; graphics processing units; CPU-GPU heterogeneous architecture; TAP; TAP-RRIP; TAP-UCP; TLP-aware cache management policy; core-sampling mechanism; dynamic cache partitioning; last-level cache management; promotion-based cache management; rereference interval prediction; several shared cache management mechanisms; shared resource management; thread-level parallelism; utility-based cache partitioning; Benchmark testing; Computer architecture; Graphics processing unit; Instruction sets; Measurement; Radiation detectors; System-on-a-chip;

Language :

English

Publisher :

IEEE

Conference_Title :

High Performance Computer Architecture (HPCA), 2012 IEEE 18th International Symposium on

Conference_Location :

New Orleans, LA

ISSN :

1530-0897

Print_ISBN :

978-1-4673-0827-4

Electronic_ISBN :

1530-0897

Type :

conf

DOI :

10.1109/HPCA.2012.6168947

Filename :

6168947

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3539860