Title :
SIMT Microscheduling: Reducing Thread Stalling in Divergent Iterative Algorithms
Author :
Frey, Steffen ; Reina, Guido ; Ertl, Thomas
Author_Institution :
Visualization Res. Center, Univ. of Stuttgart (VISUS), Stuttgart, Germany
Abstract :
The global scheduler of a current GPU distributes thread blocks to streaming multiprocessors (SMs), which schedule threads for execution at the granularity of a warp. Threads in a warp execute the same code path in lockstep, which can waste a large number of cycles when control flow diverges. To overcome this general issue of SIMT architectures, we propose techniques that relax divergence on the fly within a computation kernel and thereby achieve much higher overall utilization of the processing cores. The techniques address branch divergence and loop divergence (and may be combined) by switching to suitable tasks during a GPU kernel run whenever divergence occurs. They can easily be applied to arbitrary iterative algorithms, and we evaluate their performance and effectiveness using synthetic and real-world applications.
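The loop-divergence problem described in the abstract can be illustrated with a toy cost model. The sketch below is not the authors' implementation: the function names, the warp size of 4, and the shared task queue are assumptions made for illustration, and the model ignores the cost of fetching a task or switching context. It counts lane-cycles (one lane occupied for one lockstep iteration) for a static assignment, where every lane waits for the slowest task in its warp, and for a scheme in which a lane that finishes early picks up the next pending task.

```python
def lockstep_lane_cycles(iterations, warp_size=4):
    """Static assignment: tasks are split into consecutive warps, and each
    warp occupies all of its lanes until its longest task has finished.
    Returns (useful lane-cycles, occupied lane-cycles)."""
    useful = sum(iterations)
    occupied = 0
    for start in range(0, len(iterations), warp_size):
        warp = iterations[start:start + warp_size]
        # Every lane is held for as long as the slowest task in the warp.
        occupied += max(warp) * warp_size
    return useful, occupied


def task_switching_lane_cycles(iterations, warp_size=4):
    """Hypothetical task switching: a single warp whose lanes fetch the next
    pending task as soon as their current one is done (fetch cost ignored).
    Returns (useful lane-cycles, occupied lane-cycles)."""
    queue = list(reversed(iterations))  # pop() then yields tasks in order
    remaining = [0] * warp_size         # iterations left per lane
    steps = 0
    useful = 0
    while True:
        # Idle lanes pull new work before the next lockstep iteration.
        for lane in range(warp_size):
            while remaining[lane] == 0 and queue:
                remaining[lane] = queue.pop()
        if not any(remaining):
            break
        steps += 1
        for lane in range(warp_size):
            if remaining[lane] > 0:
                remaining[lane] -= 1
                useful += 1
    return useful, steps * warp_size


if __name__ == "__main__":
    tasks = [1, 8, 2, 8, 1, 1, 8, 2]  # iteration counts per task (made up)
    for name, fn in [("lockstep", lockstep_lane_cycles),
                     ("task switching", task_switching_lane_cycles)]:
        useful, occupied = fn(tasks)
        print(f"{name}: {useful}/{occupied} lane-cycles useful")
```

For the made-up task list above, both schemes perform 31 useful lane-cycles, but the static assignment occupies 64 lane-cycles while the task-switching variant occupies 40 under this model. On real hardware the gain would be reduced by the overhead of fetching and switching tasks, which the sketch deliberately leaves out.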
Keywords :
graphics processing units; iterative methods; mathematics computing; multiprocessing systems; scheduling; GPU global scheduler; SIMT microscheduling; branch-and-loop divergence technique; computation kernel; divergent control flow; divergent iterative algorithm; graphics processing unit; processing core utilization; symmetric multiprocessor; thread block scheduling; thread stalling reduction; warp granularity; Context; Graphics processing unit; Hardware; Instruction sets; Kernel; Memory management; Switches; Divergence; GPU; Scheduling;
Conference_Title :
Parallel, Distributed and Network-Based Processing (PDP), 2012 20th Euromicro International Conference on
Conference_Location :
Garching
Print_ISBN :
978-1-4673-0226-5
DOI :
10.1109/PDP.2012.62