Author : 
Stratton, John A. ; Rodrigues, Christopher ; Sung, I-Jui ; Chang, Li-Wen ; Anssari, Nasser ; Liu, Geng ; Hwu, Wen-Mei W. ; Obeid, Nady
         
        
Abstract :
A study of the implementation patterns among massively threaded applications for many-core GPUs reveals that each of the seven most commonly used algorithm and data optimization techniques can enhance the performance of applicable kernels by 2 to 10× in current processors while also improving future scalability.
         
        
Keywords :
graphics processing units; multiprocessing systems; optimisation; data optimization techniques; many-core GPU; massively threaded systems; Bandwidth; Graphics processing unit; Histograms; Instruction sets; Multithreading; Optimization; System-on-a-chip; Parboil benchmarks; accelerators; optimization patterns; scalability