Title of article :
SOLUTION OF LARGE LINEAR SYSTEMS ON PIPELINED SIMD MACHINES
Author/Authors :
NIKOLAUS GEERS (author), ROLAND KLEES (author)
Issue Information :
Journal issue with serial number, year 1997
Pages :
16
From page :
2433
To page :
2448
Abstract :
We developed a direct out-of-core solver for dense non-symmetric linear systems of 'arbitrary' size N × N. The algorithm fully employs the Basic Linear Algebra Subprograms (BLAS), and can therefore easily be adapted to different computer architectures by using the corresponding optimized routines. We used blocked versions of left-looking and right-looking variants of LU decomposition to perform most of the operations in Level 3 BLAS, to reduce the number of I/O operations and to minimize the CPU time usage. The storage requirements of the algorithm are only 2N × NB data elements where NB ≪ N. Depending on the sustained floating point performance and the sustained I/O rate of the given hardware, we derived formulas that allow for choosing optimal values of NB to balance between CPU time and I/O time. We tested the algorithm by means of linear systems derived from 3D-BEM for strongly and weakly singular integral equations and from interpolation problems for scattered data on closed surfaces in R³. It took only about 2.5 CPU minutes on a 5 GFLOPS vector computer SNI S600/20 to solve a linear system of size 10 000, which corresponds to a performance of 4.3 GFLOPS; a value of NB = 650 gives a reasonable I/O time and the necessary main storage size is about 13 Mwords. In addition, we compared the algorithm with (1) an out-of-core version of GMRES and (2) a wavelet transform followed by in-core GMRES after thresholding. At least for boundary integral equations of classical boundary value problems of potential theory, the out-of-core version of GMRES is superior to the direct out-of-core solver and the wavelet transform since the algorithm converged after at most 5 iteration steps. It took about 17 s to solve a system with 8192 unknowns compared with 146 s for direct out-of-core and 402 s for wavelet transform followed by in-core GMRES.
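The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and the operation count (2/3)·N³ is the standard leading-order cost of dense LU decomposition, assumed here to be the convention behind the reported GFLOPS rate. The workspace size 2N × NB is the one stated in the abstract.

```python
def workspace_words(n: int, nb: int) -> int:
    """Main-storage workspace in data elements: 2 * N * NB (from the abstract)."""
    return 2 * n * nb


def lu_flops(n: int) -> float:
    """Leading-order flop count of dense LU, (2/3) * N**3 (assumed convention)."""
    return 2.0 * n**3 / 3.0


def cpu_minutes(n: int, sustained_gflops: float) -> float:
    """CPU time in minutes for LU of size n at a given sustained rate."""
    return lu_flops(n) / (sustained_gflops * 1e9) / 60.0


if __name__ == "__main__":
    n, nb = 10_000, 650  # system size and block width quoted in the abstract
    print(f"workspace: {workspace_words(n, nb) / 1e6:.1f} Mwords")
    print(f"CPU time at 4.3 GFLOPS: {cpu_minutes(n, 4.3):.2f} min")
```

Under these assumptions the workspace comes to 13 Mwords and the CPU time to roughly 2.6 minutes, in line with the "about 13 Mwords" and "about 2.5 CPU minutes" reported above.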
Keywords :
dense linear systems , LU decomposition , BLAS , BEM , out-of-core solver
Journal title :
International Journal for Numerical Methods in Engineering
Serial Year :
1997
Record number :
423367