Title :
Bisection and twisted SVD on GPU
Author :
Lu He; Yan Luo;Rui Liu; Hengyong Yu; Yu Cao; Xuzhou Chen; Seung Woo Son
Author_Institution :
University of Massachusetts Lowell, 01854, United States
Abstract :
Singular value decomposition (SVD) is one of the most important factorizations in matrix computation. However, computing SVD is still time-consuming, especially when the dimension of matrices exceeds tens of thousands. In this paper, we present a high performance approach called “Bisection and Twisted” (BT) for solving bidiagonal SVD. As modern general purpose GPUs have shown their extreme computational advantages in parallel computing, we implement the BT algorithm on single and multiple GPUs. With our carefully designed GPU kernels, the BT algorithm is about 10 times faster than MKL divide-and-conquer routine DBDSDC on an 8-core 2.53GHz CPU, and 36 times faster than CULA QR routine DBDSQR on the same GPUs. Additionally, the BT algorithm is able to compute SVD for matrices of size 1 million by 1 million with only two GPUs. To the best of our knowledge, no implementation has achieved such a scale.
Keywords :
"Graphics processing units","Algorithm design and analysis","Signal processing algorithms","Kernel","Optimization","Instruction sets","Matrix decomposition"
Conference_Titel :
High Performance Extreme Computing Conference (HPEC), 2015 IEEE
DOI :
10.1109/HPEC.2015.7322479