Title : 
Warped register file: A power efficient register file for GPGPUs
         
        
            Author : 
Abdel-Majeed, M. ; Annavaram, Murali
         
        
            Author_Institution : 
Electr. Eng. Dept., Univ. of Southern California, Los Angeles, CA, USA
         
        
        
        
        
        
            Abstract : 
General purpose graphics processing units (GPGPUs) can execute hundreds of concurrent threads. To support this massive parallelism, GPGPUs provide a very large register file, even larger than a cache, to hold the state of each thread. As technology scales, the leakage power of SRAM cells worsens, making register file static power consumption a major concern. As supply voltage scaling slows, the dynamic power consumption of a register file is no longer decreasing. These concerns are particularly acute in GPGPUs because of their large register files. This paper presents two techniques to reduce GPGPU register file power consumption. By exploiting the unique software execution model of GPGPUs, we propose a tri-modal register access control unit to reduce leakage power. This unit turns off every unallocated register and places each allocated register into a drowsy state immediately after each access. The average inter-access distance to a register is 789 cycles in GPGPUs. Hence, aggressively moving a register into the drowsy state immediately after each access yields a 90% reduction in leakage power with negligible performance impact. To reduce dynamic power, this paper proposes an active mask aware activity gating unit that avoids charging the bit lines and wordlines of registers associated with inactive threads within a warp. Because of insufficient parallelism and branch divergence, warps have many inactive threads, and the registers associated with those threads can be identified precisely using the active mask. By combining the two techniques, we show that the power consumption of the register file can be reduced by 69% on average.
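The two mechanisms described in the abstract can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the authors' hardware design: the names `Mode`, `RegisterFileModel`, `allocate`, `release`, and `access` are hypothetical, the warp size of 32 is assumed, and the choice to start a newly allocated register in the drowsy state is an assumption the abstract does not confirm.

```python
from enum import Enum

WARP_SIZE = 32  # assumed number of threads (lanes) per warp


class Mode(Enum):
    OFF = 0     # unallocated register: powered off
    DROWSY = 1  # allocated but idle: low-leakage state, contents retained
    ON = 2      # currently being accessed


class RegisterFileModel:
    """Hypothetical behavioral model of tri-modal control plus mask gating."""

    def __init__(self, num_registers):
        # Every register starts unallocated, hence off.
        self.mode = [Mode.OFF] * num_registers

    def allocate(self, reg):
        # Assumption: an allocated register waits in the drowsy state.
        self.mode[reg] = Mode.DROWSY

    def release(self, reg):
        # A register that is no longer allocated is turned off again.
        self.mode[reg] = Mode.OFF

    def access(self, reg, active_mask):
        """Access one warp-wide register; return the lanes actually driven."""
        if self.mode[reg] is Mode.OFF:
            raise ValueError("access to an unallocated register")
        self.mode[reg] = Mode.ON  # wake the register for this access
        # Activity gating: only lanes whose active-mask bit is set are driven.
        lanes = [lane for lane in range(WARP_SIZE) if (active_mask >> lane) & 1]
        # Return to drowsy immediately after the access completes.
        self.mode[reg] = Mode.DROWSY
        return lanes


rf = RegisterFileModel(4)
rf.allocate(1)
driven = rf.access(1, 0b1011)  # lanes 0, 1, and 3 are active
```

In this model, a register spends all time between accesses in the drowsy state, which mirrors the abstract's argument that long inter-access distances make aggressive drowsy transitions worthwhile.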
         
        
            Keywords : 
authorisation; cache storage; concurrency control; graphics processing units; multi-threading; optimising compilers; performance evaluation; power aware computing; power consumption; GPGPU register file power consumption; SRAM cells; active mask aware activity gating unit; average interaccess distance; charging bit lines; concurrent threads; dynamic power consumption; general purpose graphics processing units; leakage power consumption; leakage power reduction; parallelism GPGPUs; power efficient register file; register allocation; register file size; register file static power consumption; software execution model; supply voltage scaling; technology scales; tri-modal register access control unit; warped register file; Benchmark testing; Instruction sets; Microarchitecture; Parallel processing; Power demand; Registers
         
        
        
        
            Conference_Titel : 
High Performance Computer Architecture (HPCA2013), 2013 IEEE 19th International Symposium on
         
        
            Conference_Location : 
Shenzhen
         
        
        
            Print_ISBN : 
978-1-4673-5585-8
         
        
        
            DOI : 
10.1109/HPCA.2013.6522337