Regional Information Center for Science and Technology (RICEST) - An energy efficient GPGPU memory hierarchy with tiny incoherent caches

DocumentCode :

3496774

Title :

An energy efficient GPGPU memory hierarchy with tiny incoherent caches

Author :

Sankaranarayanan, Alamelu ; Ardestani, Ehsan K. ; Briz, Jose Luis ; Renau, Jose

Author_Institution :

Dept. of Comput. Eng., Univ. of California Santa Cruz, Santa Cruz, CA, USA

fYear :

2013

fDate :

4-6 Sept. 2013

Firstpage :

Lastpage :

Abstract :

With progressive generations and the ever-increasing promise of computing power, GPGPUs have been quickly growing in size, and at the same time, energy consumption has become a major bottleneck for them. The first level data cache and the scratchpad memory are critical to the performance of a GPGPU, but they are extremely energy inefficient due to the large number of cores they need to serve. This problem could be mitigated by introducing a cache higher up in the hierarchy that services fewer cores, but this introduces cache coherency issues that may become very significant, especially for a GPGPU with hundreds of thousands of in-flight threads. In this paper, we propose adding incoherent tinyCaches between each lane in an SM and the first level data cache that is currently shared by all the lanes in an SM. In a normal multiprocessor, this would require hardware cache coherence between all the SM lanes capable of handling hundreds of thousands of threads. Our incoherent tinyCache architecture exploits certain unique features of the CUDA/OpenCL programming model to avoid complex coherence schemes. This tinyCache is able to filter out 62% of memory requests that would otherwise need to be serviced by the DL1G, and almost 81% of scratchpad memory requests, allowing us to achieve a 37% energy reduction in the on-chip memory hierarchy. We evaluate the tinyCache for different memory patterns and show that it is beneficial in most cases.
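
The filtering idea in the abstract can be illustrated with a toy model. The sketch below is not the authors' design: the cache geometry (8 direct-mapped lines of 32 bytes), the read-only trace format, and the choice to invalidate every tinyCache at a barrier are all assumptions made for illustration, and the names `TinyCache` and `filter_rate` are hypothetical. It only shows how a small private cache per lane can absorb repeated accesses before they reach a shared first level cache.

```python
from dataclasses import dataclass, field


@dataclass
class TinyCache:
    """Toy direct-mapped cache private to one lane (all sizes are assumptions)."""
    num_lines: int = 8
    line_bytes: int = 32
    tags: list = field(default_factory=list)

    def __post_init__(self):
        self.tags = [None] * self.num_lines

    def access(self, addr: int) -> bool:
        """Return True on a hit; on a miss, fill the line and return False."""
        line = addr // self.line_bytes
        idx = line % self.num_lines
        if self.tags[idx] == line:
            return True
        self.tags[idx] = line
        return False

    def invalidate(self) -> None:
        """Drop all contents; no coherence traffic is modeled."""
        self.tags = [None] * self.num_lines


def filter_rate(trace, num_lanes: int) -> float:
    """Fraction of accesses served by the per-lane caches.

    trace items are ("access", lane, addr) or ("barrier",).
    Invalidating at barriers is an assumption of this sketch,
    not a mechanism stated in the abstract.
    """
    caches = [TinyCache() for _ in range(num_lanes)]
    hits = total = 0
    for event in trace:
        if event[0] == "barrier":
            for cache in caches:
                cache.invalidate()
            continue
        _, lane, addr = event
        total += 1
        if caches[lane].access(addr):
            hits += 1
    return hits / total if total else 0.0


if __name__ == "__main__":
    trace = [
        ("access", 0, 0), ("access", 0, 4), ("access", 0, 8), ("access", 0, 12),
        ("access", 1, 0),   # a different lane misses in its own cache
        ("barrier",),
        ("access", 0, 0),   # misses again after the invalidation
    ]
    print(filter_rate(trace, num_lanes=2))  # prints 0.5
```

In this trace three of the six accesses hit in a tinyCache, so only the remaining three would reach the shared cache. The percentages reported in the abstract come from the authors' evaluation and cannot be reproduced by this toy model.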

Keywords :

cache storage; graphics processing units; multiprocessing systems; CUDA/OpenCL programming model; energy efficient GPGPU memory hierarchy; first level data cache; hardware cache coherence; incoherent tinyCache architecture; memory patterns; multiprocessors; on-chip memory hierarchy; scratchpad memory; Benchmark testing; Coherence; Computer architecture; Graphics processing units; Instruction sets; Registers; System-on-chip; Caches; Energy-efficiency; GPGPUs; Memory hierarchy;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Low Power Electronics and Design (ISLPED), 2013 IEEE International Symposium on

Conference_Location :

Beijing

Print_ISBN :

978-1-4799-1234-6

Type :

conf

DOI :

10.1109/ISLPED.2013.6629259

Filename :

6629259

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3496774