Title :
Performance and area aware replacement policy for GPU architecture
Author :
Abadi, Fatemeh Kazemi Hassan ; Safari, Saeed
Author_Institution :
Sch. of Electr. & Comput. Eng., Univ. of Tehran, Tehran, Iran
Abstract :
Recent studies have shown that cache partitioning is an efficient technique for improving throughput in multi-core processors. Existing cache partitioning algorithms assume Least Recently Used (LRU) as the underlying replacement policy. We propose applying the long-established tree-based Pseudo-LRU (PLRU) policy to two-level caches in GPUs, with the aim of achieving a higher speedup than LRU or matching its performance. The algorithm is based on Pseudo-LRU, which uses a binary tree to reduce area overhead. It also uses set-dueling to dynamically adapt its insertion and promotion. We evaluate the effect of this policy on both the L1 and L2 caches in GPUs. We propose a high-accuracy profiling logic and cache partitioning hardware for our scheme. We evaluate the hardware costs in terms of performance, miss rates, DRAM locality, area, and energy, and compare them with LRU and FIFO partitioning algorithms. We define a set of machine models to discuss our scheme on several general-purpose workloads. The results show that our solutions impose negligible performance degradation compared with LRU. We then use insertion and promotion vectors to compensate for the drop in performance. On compute workloads, the technique reduces the L2 miss rate by about 10.11%.
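The binary-tree idea behind Pseudo-LRU mentioned in the abstract can be illustrated with a minimal sketch. It models a single cache set and shows only the classic tree-based PLRU update and victim selection; it does not attempt to reproduce the paper's insertion and promotion vectors, set-dueling, or partitioning hardware. The class and function names (`TreePLRU`, `plru_bits_per_set`, `lru_bits_per_set`) are hypothetical, and the LRU storage estimate assumes an implementation that keeps a log2(ways)-bit age per line.

```python
class TreePLRU:
    """Classic tree-based pseudo-LRU state for one cache set (illustrative only)."""

    def __init__(self, ways):
        if ways < 2 or ways & (ways - 1):
            raise ValueError("ways must be a power of two >= 2")
        self.ways = ways
        self.levels = ways.bit_length() - 1  # log2(ways)
        # One bit per internal tree node, stored heap-style (root at index 0).
        # 0 means the next victim lies in the left subtree, 1 in the right.
        self.bits = [0] * (ways - 1)

    def touch(self, way):
        """Record an access: make every node on the path point away from `way`."""
        node = 0
        for level in range(self.levels - 1, -1, -1):
            went_right = (way >> level) & 1
            self.bits[node] = 1 - went_right
            node = 2 * node + 1 + went_right

    def victim(self):
        """Follow the tree bits from the root to pick the way to replace."""
        node = 0
        way = 0
        for _ in range(self.levels):
            go_right = self.bits[node]
            way = (way << 1) | go_right
            node = 2 * node + 1 + go_right
        return way


def plru_bits_per_set(ways):
    # Tree PLRU keeps one bit per internal node of the binary tree.
    return ways - 1


def lru_bits_per_set(ways):
    # Assumption: true LRU stored as a log2(ways)-bit age counter per line.
    return ways * (ways.bit_length() - 1)
```

For example, under these assumptions a 16-way set needs 15 state bits with tree PLRU versus 64 bits with per-line LRU ages, which is the kind of area saving the abstract refers to.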
Keywords :
cache storage; graphics processing units; multiprocessing systems; performance evaluation; DRAM locality; FIFO partitioning algorithms; GPU architecture; L2 miss rate; area aware replacement policy; binary tree; cache partitioning algorithms; general purpose workloads; insertion vectors; least recently used; miss rates; multicore processors; performance aware replacement policy; performance degradation; performance matching; promotion vectors; pseudo LRU; set-dueling; tree-based PLRU; two-level caches; Benchmark testing; Computational modeling; Graphics processing units; Measurement; Memory management; Random access memory; Vectors; GPU; Insertion and promotion vector (IPV); Tree-based PLRU;
Conference_Title :
Computer and Knowledge Engineering (ICCKE), 2014 4th International eConference on
Conference_Location :
Mashhad
Print_ISBN :
978-1-4799-5486-5
DOI :
10.1109/ICCKE.2014.6993378