DocumentCode :
572399
Title :
Energy-efficient mechanisms for managing thread context in throughput processors
Author :
Gebhart, Mark ; Johnson, Daniel R. ; Tarjan, David ; Keckler, Stephen W. ; Dally, William J. ; Lindholm, Erik ; Skadron, Kevin
Author_Institution :
Univ. of Texas at Austin, Austin, TX, USA
fYear :
2011
fDate :
4-8 June 2011
Firstpage :
235
Lastpage :
246
Abstract :
Modern graphics processing units (GPUs) use a large number of hardware threads to hide both function unit and memory access latency. Extreme multithreading requires a complicated thread scheduler as well as a large register file, which is expensive to access both in terms of energy and latency. We present two complementary techniques for reducing energy on massively threaded processors such as GPUs. First, we examine register file caching to replace accesses to the large main register file with accesses to a smaller structure containing the immediate register working set of active threads. Second, we investigate a two-level thread scheduler that maintains a small set of active threads to hide ALU and local memory access latency and a larger set of pending threads to hide main memory latency. Combined with register file caching, a two-level thread scheduler provides a further reduction in energy by limiting the allocation of temporary register cache resources to only the currently active subset of threads. We show that on average, across a variety of real-world graphics and compute workloads, a 6-entry per-thread register file cache reduces the number of reads and writes to the main register file by 50% and 59%, respectively. We further show that the active thread count can be reduced by a factor of 4 with minimal impact on performance, resulting in a 36% reduction of register file energy.
Keywords :
graphics processing units; multi-threading; power aware computing; GPU; energy efficient mechanisms; file caching; file register; hardware threads; main memory latency; memory access latency; thread context management; thread scheduler; throughput processors; Computer architecture; Graphics; Graphics processing unit; Instruction sets; Pipelines; Registers
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Architecture (ISCA), 2011 38th Annual International Symposium on
Conference_Location :
San Jose, CA
ISSN :
1063-6897
Print_ISBN :
978-1-4503-0472-6
Type :
conf
Filename :
6307762