DocumentCode :
576809
Title :
Tuning Block Size for QR Factorization on CPU-GPU Hybrid Systems
Author :
Tsai, Yaohung M. ; Wang, Weichung ; Chen, Ray-Bing
Author_Institution :
Dept. of Math., Nat. Taiwan Univ., Taipei, Taiwan
fYear :
2012
fDate :
20-22 Sept. 2012
Firstpage :
205
Lastpage :
211
Abstract :
In CPU-GPU hybrid systems, the QR factorization in MAGMA leaves the CPU idle because of its fixed block size. To improve the computational efficiency of MAGMA QR factorization, we propose a variable block size auto-tuning scheme for CPU-GPU hybrid systems. First, we fit the CPU and GPU costs in MAGMA QR factorization with two independent regression models, which serve as the CPU and GPU performance models. Next, we propose a block size optimization scheme that tunes the block size adaptively to minimize a cost objective function. The cost objective function is designed to balance the workloads between CPU and GPU based on the performance models. Finally, several numerical results demonstrate the performance gains of the new QR factorization algorithm.
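The two-stage idea in the abstract (fit cost models, then pick the block size that balances CPU and GPU work) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not taken from the paper: the models are straight lines in the block size, the timing samples are synthetic, the objective is the absolute difference between predicted CPU and GPU times, and the helper names `fit_line`, `predict`, and `pick_block_size` are hypothetical. The paper's actual regression models and objective function are not given in this record.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y ~ a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, nb):
    """Evaluate a fitted (intercept, slope) model at block size nb."""
    intercept, slope = model
    return intercept + slope * nb


def pick_block_size(cpu_model, gpu_model, candidates):
    """Return the candidate block size with the smallest predicted
    CPU/GPU time imbalance (an assumed stand-in for the paper's objective)."""
    return min(
        candidates,
        key=lambda nb: abs(predict(cpu_model, nb) - predict(gpu_model, nb)),
    )


# Synthetic timing samples (hypothetical values, not measurements).
block_sizes = [32, 64, 96, 128]
cpu_times = [1.0, 2.0, 3.0, 4.0]   # assumed: CPU panel cost grows with block size
gpu_times = [4.0, 3.5, 3.0, 2.5]   # assumed: GPU update cost shrinks with block size

cpu_model = fit_line(block_sizes, cpu_times)
gpu_model = fit_line(block_sizes, gpu_times)

best_nb = pick_block_size(cpu_model, gpu_model, range(32, 129, 16))
print("selected block size:", best_nb)
```

In a variable block size scheme, a selection like this would presumably be repeated at each step of the factorization with models that also depend on the remaining matrix dimensions; the sketch keeps a single step to show only the balancing idea.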
Keywords :
graphics processing units; matrix decomposition; multiprocessing systems; CPU costs; CPU performance models; CPU-GPU hybrid systems; GPU costs; GPU performance models; MAGMA QR factorization; block size optimization scheme; computational efficiency improvement; fixed block size; independent regression models; matrix algebra-on-GPU-and-multicore architectures; tuning block size; variable block size auto-tuning scheme; workload balancing; Algorithms; Central Processing Unit; Computer architecture; Graphics processing units; Matrix decomposition; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Embedded Multicore SoCs (MCSoC), 2012 IEEE 6th International Symposium on
Conference_Location :
Aizu-Wakamatsu
Print_ISBN :
978-1-4673-2535-6
Electronic_ISBN :
978-0-7695-4800-5
Type :
conf
DOI :
10.1109/MCSoC.2012.32
Filename :
6354700