Title of article :
Improving the numerical stability and the performance of a parallel sparse solver
Author/Authors :
A. C. N. van Duin، نويسنده , , P. C. Hansen and S. H. Jensen، نويسنده , , Tz. Ostromsky، نويسنده , , H. Wijshoff، نويسنده , , Z. Zlatev، نويسنده ,
Issue Information :
دوهفته نامه با شماره پیاپی سال 1995
Pages :
16
From page :
81
To page :
96
Abstract :
Coarse grain parallel codes for solving sparse systems of linear algebraic equations can be developed in several different ways. The following procedure is suitable for some parallel computers. A preliminary reordering of the matrix is first applied to move as many zero elements as possible to the lower left corner. After that the matrix is partitioned into large blocks and the blocks in the lower left corner contain only zero elements. An attempt to obtain a good load-balance is carried out by allowing the diagonal blocks to be rectangular. While the algorithm based on the above ideas has good parallel properties, some stability problems may arise during the factorization because the pivotal search is restricted to the diagonal blocks. A simple a priori procedure has been used in a previous version in an attempt to stabilize the algorithm. In this paper it is shown that three enhanced stability devices can successfully be incorporated in the algorithm so that it is further stabilized and, moreover, the parallel properties of the original algorithm are preserved. The first device is based on a dynamic check of the stability. In the second device a slightly modified reordering is used in an attempt to get more nonzero elements in the diagonal blocks (the number of candidates for pivots tends to increase in this situation and, therefore, there is a better chance to select more stable pivots). The third device applies a P5-like ordering as a secondary criterion in the basic reordering procedure. This tends to improve the reordering and the performance of the solver. Moreover, the device is stable, while the original P5 ordering is often unstable. Numerical results obtained by using the three new devices are presented. The well-known sparse matrices from the Harwell-Boeing set are used in the experiments.
Keywords :
Gaussian elimination , Reordering , General sparsity , Coarse-grain parallelism , Drop-tolerance , Speed-up , Block algorithm , Sparse matrix
Journal title :
Computers and Mathematics with Applications
Serial Year :
1995
Journal title :
Computers and Mathematics with Applications
Record number :
917691
Link To Document :
بازگشت