Title :
HLanc: Heterogeneous Parallel Implementation of the Implicitly Restarted Lanczos Method
Author :
Shuai Zhang ; Tao Li ; Xiaofan Jiao ; Yifeng Wang ; Yulu Yang
Author_Institution :
Coll. of Comput. & Control Eng., Nankai Univ., Tianjin, China
Abstract :
The Graphics Processing Unit (GPU) has become a ubiquitous accelerator for general-purpose computing, such as linear algebra routines and numerical methods. The implicitly restarted Lanczos method (IRLM) is well suited to solving the partial eigenvalue problem for large symmetric sparse matrices, which is important in many real-world applications. In this paper, we present the HLanc library, a parallel implementation of IRLM on the heterogeneous CPU-GPU architecture using the CUDA programming model. The HLanc library separates the heterogeneous parallel IRLM solvers from the sparse matrix-vector multiplication (SPMV) operators. The SPMV operators hide the details of sparse matrix storage from the IRLM solvers, so the solvers can work with any sparse matrix format. In particular, the SPMV operators and IRLM solvers can be combined arbitrarily to achieve the best performance on a CPU-GPU heterogeneous system. HLanc is evaluated using eight sparse matrices on the NVIDIA GTX 480 and GTX TITAN Black GPUs. The results show that HLanc achieves a 15-fold speedup over the ARPACK library and scales well across different GPU generations.
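The separation between solver and SPMV operator described in the abstract can be illustrated with a minimal CPU-only sketch. It is not HLanc's API: the names `make_csr_spmv`, `make_coo_spmv`, and `lanczos` are hypothetical, the code is plain Python rather than CUDA, and the solver is the basic Lanczos recurrence without the implicit restart. The only point it makes is that a solver which sees the matrix solely through a "multiply by a vector" callable produces the same tridiagonal coefficients whichever storage format sits behind that callable.

```python
import math


def make_csr_spmv(n, row_ptr, col_idx, vals):
    """Build y = A x for a matrix held in CSR form (hypothetical helper)."""
    def spmv(x):
        y = [0.0] * n
        for i in range(n):
            s = 0.0
            for k in range(row_ptr[i], row_ptr[i + 1]):
                s += vals[k] * x[col_idx[k]]
            y[i] = s
        return y
    return spmv


def make_coo_spmv(n, rows, cols, vals):
    """Build y = A x for a matrix held in COO form (hypothetical helper)."""
    def spmv(x):
        y = [0.0] * n
        for r, c, v in zip(rows, cols, vals):
            y[r] += v * x[c]
        return y
    return spmv


def lanczos(spmv, v0, m):
    """Basic m-step Lanczos recurrence (no restart, no reorthogonalization).

    The matrix is touched only through spmv. Returns the diagonal (alphas)
    and off-diagonal (betas) of the tridiagonal matrix T.
    """
    norm = math.sqrt(sum(x * x for x in v0))
    v = [x / norm for x in v0]
    v_prev = [0.0] * len(v0)
    beta_prev = 0.0
    alphas, betas = [], []
    for j in range(m):
        w = spmv(v)
        alpha = sum(wi * vi for wi, vi in zip(w, v))
        w = [wi - alpha * vi - beta_prev * pi
             for wi, vi, pi in zip(w, v, v_prev)]
        beta = math.sqrt(sum(wi * wi for wi in w))
        alphas.append(alpha)
        if j == m - 1 or beta < 1e-12:
            break  # last step, or an invariant subspace was found
        betas.append(beta)
        v_prev, v, beta_prev = v, [wi / beta for wi in w], beta
    return alphas, betas


# Symmetric test matrix [[2,-1,0],[-1,2,-1],[0,-1,2]] in two storage formats.
cols = [0, 1, 0, 1, 2, 1, 2]
vals = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
csr_op = make_csr_spmv(3, [0, 2, 5, 7], cols, vals)
coo_op = make_coo_spmv(3, [0, 0, 1, 1, 1, 2, 2], cols, vals)

t_csr = lanczos(csr_op, [1.0, 0.0, 0.0], 3)
t_coo = lanczos(coo_op, [1.0, 0.0, 0.0], 3)
print(t_csr == t_coo)
```

In a GPU setting the same idea would presumably let a format-specific SPMV kernel be swapped in without touching the solver, which is the kind of free combination the abstract describes.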
Keywords :
eigenvalues and eigenfunctions; graphics processing units; mathematics computing; matrix multiplication; parallel architectures; parallel programming; software libraries; sparse matrices; vectors; CPU-GPU heterogeneous system; CUDA programming model; GTX TITAN Black GPUs; HLanc library; NVIDIA GTX 480; SPMV operators; general purpose computing; graphics processing unit; heterogeneous CPU-GPU architecture; heterogeneous parallel IRLM solvers; heterogeneous parallel implementation; implicitly restarted Lanczos method; linear algebra routines; numerical methods; partial eigenvalue problem; sparse matrix-vector multiplication operators; symmetric sparse matrices; ubiquitous accelerator; Computer architecture; Eigenvalues and eigenfunctions; Graphics processing units; Hardware; Libraries; Sparse matrices; Symmetric matrices; CUDA; GPU; HLanc; IRLM; SPMV; symmetric sparse matrix;
Conference_Title :
Parallel Processing Workshops (ICPPW), 2014 43rd International Conference on
DOI :
10.1109/ICPPW.2014.60