DocumentCode :
1866773
Title :
A new parallel matrix multiplication algorithm on distributed-memory concurrent computers
Author :
Choi, Jaeyoung
Author_Institution :
Sch. of Comput., Soongsil Univ., Seoul, South Korea
fYear :
1997
fDate :
28 Apr-2 May 1997
Firstpage :
224
Lastpage :
229
Abstract :
The author presents DIMMA (Distribution-Independent Matrix Multiplication Algorithm), a new parallel matrix multiplication algorithm for distributed-memory concurrent computers that is fast and scalable, and whose performance is independent of how the data are distributed over the processors. The algorithm is based on two new ideas: it uses a modified pipelined communication scheme to overlap computation and communication effectively, and it exploits the LCM block concept to obtain the maximum performance of the sequential BLAS routine on each processor even when the block size is very small or very large. The algorithm is implemented and compared with SUMMA on the Intel Paragon computer.
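A minimal serial sketch of the outer-product (SUMMA-style) panel loop that DIMMA builds on, written in NumPy; the panel width nb and the grouping factor lcm_width are illustrative assumptions, not values from the paper, and the actual algorithm distributes the panels block-cyclically over a process grid and broadcasts them with a pipelined scheme rather than looping serially.

```python
import numpy as np

def panel_matmul(A, B, nb=4, lcm_width=None):
    """Compute C = A @ B as a sum of rank-nb panel products.

    Each iteration multiplies a column panel of A by the matching row
    panel of B, mirroring one broadcast-and-multiply step of a
    SUMMA-like algorithm. If lcm_width is given, consecutive panels are
    grouped so each local GEMM works on a wider panel, which is the
    spirit of DIMMA's LCM block idea: keep the per-call BLAS work large
    even when the distribution block size nb is small. (Sketch only;
    no communication or block-cyclic distribution is modeled here.)
    """
    m, k = A.shape
    k2, n = B.shape
    assert k == k2
    width = nb if lcm_width is None else lcm_width
    C = np.zeros((m, n))
    for start in range(0, k, width):
        stop = min(start + width, k)
        # One "broadcast" step: multiply a panel of A by a panel of B.
        C += A[:, start:stop] @ B[start:stop, :]
    return C

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((12, 10))
    B = rng.standard_normal((10, 8))
    # Small panels vs. grouped (LCM-style) panels give the same result;
    # the grouping only changes how much work each GEMM call gets.
    assert np.allclose(panel_matmul(A, B, nb=2), A @ B)
    assert np.allclose(panel_matmul(A, B, nb=2, lcm_width=6), A @ B)
```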
Keywords :
computational complexity; distributed memory systems; matrix multiplication; parallel algorithms; parallel machines; pipeline processing; DIMMA; Intel Paragon computer; LCM block concept; SUMMA; computation/communication overlap; distributed-memory concurrent computers; distribution-independent matrix multiplication algorithm; maximum performance; modified pipelined communication scheme; parallel matrix multiplication algorithm; processor; sequential BLAS routine; Broadcasting; Concurrent computing; Design optimization; Distributed computing; Grid computing; Jacobian matrices; Linear algebra; Wrapping;
fLanguage :
English
Publisher :
ieee
Conference_Title :
High Performance Computing on the Information Superhighway, 1997. HPC Asia '97
Conference_Location :
Seoul
Print_ISBN :
0-8186-7901-8
Type :
conf
DOI :
10.1109/HPC.1997.592151
Filename :
592151