Title :
Tuning a Hybrid GPU-CPU V-Cycle Multilevel Preconditioner for Solving Large Real and Complex Systems of FEM Equations
Author :
Dziekonski, Adam ; Lamecki, Adam ; Mrozowski, Michal
Author_Institution :
Dept. of Microwave & Antenna Eng., Gdansk Univ. of Technol., Gdansk, Poland
fDate :
2011
Abstract :
This letter presents techniques for tuning an accelerated preconditioned conjugate gradient solver with a multilevel preconditioner. The solver is optimized for the fast solution of sparse systems of equations arising in computational electromagnetics when the finite element method is used with higher-order elements. The goal of the tuning is to increase throughput while reducing memory requirements, so that very large complex or real systems can be processed in single and double precision on commodity graphics processing units (GPUs). A threefold reduction in memory footprint is achieved by means of a new sparse matrix storage format. Acceleration is achieved by optimizing the sparse matrix-vector product on the GPU using new features of the Fermi architecture. Further improvements are obtained by introducing more levels into the preconditioner and by applying a fast sparse direct solver to the operations executed on the CPU. Numerical results for a setup consisting of a Fermi GPU (GTX 480) and a six-core Xeon CPU show that the proposed approach can handle systems involving millions of unknowns and reaches a speedup factor of almost 4 compared to a CPU-only implementation.
Keywords :
computational electromagnetics; computer graphic equipment; conjugate gradient methods; coprocessors; finite element analysis; higher order statistics; optimisation; sparse matrices; CPU; FEM equations; GPU; V-cycle multilevel preconditioner; complex systems; computational electromagnetics; finite element method; graphic processing units; higher-order elements; optimization; preconditioned conjugate gradient solver; real systems; sparse matrix vector product; Acceleration; Finite difference methods; Finite element methods; Graphics processing unit; Memory management; Sparse matrices; Tuning; Graphic processing unit (GPU); PARDISO; multilevel preconditioners; sparse matrix-vector product (SpMV);
Journal_Title :
Antennas and Wireless Propagation Letters, IEEE
DOI :
10.1109/LAWP.2011.2159769