Title :
Parallel Sparse Approximate Inverse Preconditioning on Graphic Processing Units
Author :
Dehnavi, Maryam Mehri ; Fernandez, Daniel Moses ; Gaudiot, Jean-Luc ; Giannacopoulos, Dennis D.
Author_Institution :
Massachusetts Inst. of Technol., Cambridge, MA, USA
Abstract :
Accelerating numerical algorithms for solving sparse linear systems on parallel architectures has attracted the attention of many researchers due to their applicability to many engineering and scientific problems. Solving these sparse systems often dominates the overall execution time of such problems, and the systems are mainly solved by iterative methods. Preconditioners are used to accelerate the convergence of these solvers and reduce the total execution time. Sparse approximate inverse (SAI) preconditioners are a popular class of preconditioners designed to improve the condition number of large sparse matrices. We propose a GPU-accelerated SAI preconditioning technique called GSAI, which parallelizes the computation of this preconditioner on NVIDIA graphics cards. The preconditioner is then used to enhance the convergence rate of the BiConjugate Gradient Stabilized (BiCGStab) iterative solver on the GPU. On average, the SAI preconditioner is generated 28 and 23 times faster on the NVIDIA GTX480 and TESLA M2070 graphics cards, respectively, than by ParaSails (a popular CPU implementation of SAI preconditioners) running on a single processor/core. The proposed GSAI technique computes the SAI preconditioner in approximately the same time that ParaSails takes to generate the same preconditioner on 16 AMD Opteron 252 processors.
Keywords :
digital arithmetic; graphics processing units; iterative methods; parallel architectures; AMD Opteron 252 processors; BiCGStab; GPU accelerated SAI preconditioning technique; GSAI; NVIDIA GTX480; NVIDIA graphic cards; ParaSails; TESLA M2070; biconjugate gradient stabilized iterative solver; engineering problems; graphic processing units; numerical algorithms; parallel sparse approximate inverse preconditioning; scientific problems; single processor-core results; sparse linear systems; Acceleration; Data structures; Graphics processing unit; Kernel; Memory management; Sparse matrices; Vectors; Numerical algorithms; conditioning; graphics processors; parallel algorithms; parallel programming
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
DOI :
10.1109/TPDS.2012.286