Title :
Reducing Static and Dynamic Power of L1 Data Caches in GPGPUs
Author_Institution :
Electr. Eng. Dept., Lakehead Univ., Thunder Bay, ON, Canada
Abstract :
With the widespread adoption of GPGPUs for general-purpose computing, the size of GPGPUs has grown quickly, making power consumption a major bottleneck. L1 data caches boost processor performance by hiding memory latency, but they consume significant power because they must serve many processing cores. We propose two optimization techniques to reduce the static and dynamic power of L1 data caches in GPGPUs. The first technique reduces the static power of the L1 data cache by placing cache blocks into drowsy mode immediately after each access. In GPGPUs, cache blocks are idle for long intervals. Hence, moving a cache block into the drowsy state immediately after each access reduces leakage power significantly with negligible performance impact. The second technique targets the dynamic power of the L1 data cache. Due to branch divergence, threads within a warp may follow different execution paths, which can leave some threads in the warp inactive. Existing GPGPUs access whole cache blocks regardless of which threads are inactive. We use the active mask of GPGPUs to access only the portions of cache blocks required by active threads. By dynamically disabling unnecessary sections of cache blocks, we reduce the dynamic power of the caches.
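As an illustration of the second technique, the following Python sketch maps a warp's active mask to the sections of a cache block that would need to be enabled. It is only a sketch under stated assumptions, not the authors' design: the abstract does not give the warp size, access width, or disabling granularity, so the constants below (32 threads, 4-byte coalesced accesses, 32-byte sections) and the function name `sections_to_enable` are hypothetical.

```python
# Illustrative model only; all constants and names are assumptions,
# not values taken from the paper.
WARP_SIZE = 32       # threads per warp (assumed)
WORD_BYTES = 4       # bytes read by each thread, fully coalesced (assumed)
SECTION_BYTES = 32   # granularity at which a block section can be disabled (assumed)

THREADS_PER_SECTION = SECTION_BYTES // WORD_BYTES             # 8
NUM_SECTIONS = (WARP_SIZE * WORD_BYTES) // SECTION_BYTES      # 4


def sections_to_enable(active_mask: int) -> list:
    """Return one flag per block section: True if any active thread needs it.

    Bit i of active_mask is 1 when thread i of the warp is active.
    Thread i is assumed to access word i of the cache block.
    """
    group_mask = (1 << THREADS_PER_SECTION) - 1
    flags = []
    for section in range(NUM_SECTIONS):
        # Extract the bits of the threads that map onto this section.
        group = (active_mask >> (section * THREADS_PER_SECTION)) & group_mask
        flags.append(group != 0)
    return flags
```

Under these assumptions, a fully active warp enables every section, while a divergent warp in which only the first eight threads are active would enable a single section and leave the other three untouched.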
Keywords :
cache storage; electronic engineering computing; graphics processing units; optimisation; GPGPU; L1 data caches; active mask; dynamic power; leakage power; optimization technique; power consumption; static power; Benchmark testing; Delays; Electric breakdown; Graphics processing units; Instruction sets; Optimization; GPGPU; CUDA; Memory hierarchy; Cache; Power;
Conference_Title :
Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International
Conference_Location :
Phoenix, AZ
Print_ISBN :
978-1-4799-4117-9
DOI :
10.1109/IPDPSW.2014.202