Title :
Data distribution and communication schemes for IQMR method on massively distributed memory computers
Author :
Yang, Laurence Tianruo
Author_Institution :
St. Francis Xavire Univ., Antigonish, NS, Canada
Abstract :
We study the parallelization of the IQMR method for the solutions of linear systems of equations with unsymmetric coefficient matrices. The IQMR method is an improved version of the quasi-minimal residual (IQMR) method by using the Lanczos process as a major component combining elements of numerical stability and parallel algorithm design. The algorithm is derived such that all inner products and matrix-vector multiplications of a single iteration step are independent and communication time required for the inner product can be overlapped efficiently with computation time. Two important schemes are discussed. What is the best possible data distribution and which communication network topology is most suitable for the IQMR method on massively parallel distributed memory computers. A theoretical model of data distribution and communication phases is presented mainly based on (Hoekstra et al., 1991; 1992) which allows us to give a detailed execution time complexity analysis and investigates its usefulness. It is shown that the implementation of IQMR, with a row-block decomposition of the coefficient matrix, on a ring of communication structure is the most efficient choice. Performance tests of the developed parallel IQMR algorithm have been carried out on the massively distributed memory system and experimental timing results are compared with the theoretical execution time complexity analysis
Keywords :
computational complexity; data communication; distributed memory systems; mathematics computing; matrix algebra; numerical stability; parallel algorithms; parallel machines; performance evaluation; IQMR method; Lanczos process; communication network topology; communication time; computation time; data communication; data distribution; execution time complexity analysis; linear systems of equations; massively parallel distributed memory computers; matrix-vector multiplications; numerical stability; parallel algorithm design; performance tests; quasi-minimal residual method; timing; unsymmetric coefficient matrices; Algorithm design and analysis; Communication networks; Computer networks; Concurrent computing; Distributed computing; Equations; Linear systems; Network topology; Numerical stability; Parallel algorithms;
Conference_Titel :
Parallel Processing, 2000. Proceedings. 2000 International Workshops on
Conference_Location :
Toronto, Ont.
Print_ISBN :
0-7695-0771-9
DOI :
10.1109/ICPPW.2000.869116