DocumentCode :
2438194
Title :
Exploring shared memory and cache to improve GPU performance and energy efficiency
Author :
Hao Wen ; Wei Zhang
Author_Institution :
Dept. of Electr. & Comput. Eng., Virginia Commonwealth Univ., Richmond, VA, USA
fYear :
2015
fDate :
2-4 March 2015
Firstpage :
402
Lastpage :
405
Abstract :
Graphics Processing Units (GPUs) use multiple multithreaded SIMD cores to exploit data parallelism and boost performance. State-of-the-art GPUs use configurable shared memory and cache to improve performance for applications with different access patterns. GPU programs usually exhibit access patterns unlike those of CPU programs, and their performance may not depend heavily on cache access latencies. On the other hand, shared memory capacity and other execution resources may become limiting factors for parallelism, which can significantly affect performance. In this paper, we evaluate the impact of different shared memory and cache configurations on both performance and energy consumption, which can provide useful insights for GPU programmers to use the configurable shared memory and cache more effectively.
Keywords :
cache storage; graphics processing units; shared memory systems; GPU; cache access latencies; configurable shared memory; data parallelism; graphic processing units; multithreaded SIMD cores; shared memory capacity; Benchmark testing; Graphics processing units; Instruction sets; Limiting; Loading; Memory management; Parallel processing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Quality Electronic Design (ISQED), 2015 16th International Symposium on
Conference_Location :
Santa Clara, CA
Print_ISBN :
978-1-4799-7580-8
Type :
conf
DOI :
10.1109/ISQED.2015.7085459
Filename :
7085459