DocumentCode :
506179
Title :
Vector and parallel algorithms for Cholesky factorization on IBM 3090
Author :
Agarwal, Ramesh C. ; Gustavson, Fred G.
Author_Institution :
I.B.M. Research Division, Thomas J. Watson Research Center, Yorktown Hts., New York
fYear :
1989
fDate :
12-17 Nov. 1989
Firstpage :
225
Lastpage :
233
Abstract :
In many engineering applications, a solution of Fx = b is required, where F is a symmetric positive definite matrix. This is usually done via the Cholesky factorization F = RR^T, where R is the lower triangular Cholesky factor. This is a compute-intensive problem. To achieve the best possible performance on the IBM 3090 Vector Facility (VF), the problem requires blocking at various levels to match the 3090 memory hierarchy. A large problem that does not fit in a particular level of memory is blocked so that each block fits in that level, which minimizes data transfers between the levels of memory. In this paper, various blocking schemes are described for vector and parallel implementation on the 3090 VF. Some of these algorithms have been included in the Engineering and Scientific Subroutine Library (ESSL). Performance numbers are also included. These algorithms achieve close to the peak performance of the 3090 uniprocessor and multiprocessors.
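Illustrative sketch (not taken from the paper): the abstract describes factoring F = RR^T block by block so that each block fits in a given level of memory. The Python code below shows one generic way to do this, a right-looking blocked Cholesky factorization. The function names `cholesky_unblocked` and `cholesky_blocked` and the `block` parameter are hypothetical, chosen for this example only. It does not reproduce the authors' 3090-specific blocking schemes or the ESSL routines.

```python
import math


def cholesky_unblocked(a):
    """Factor a small symmetric positive definite block: a = R R^T.

    Only the lower triangle of `a` is read. Returns a new lower triangular R.
    """
    n = len(a)
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(r[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        r[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = a[i][j] - sum(r[i][k] * r[j][k] for k in range(j))
            r[i][j] = s / r[j][j]
    return r


def cholesky_blocked(f, block=2):
    """Right-looking blocked Cholesky: returns lower triangular R with f = R R^T.

    `block` is the (hypothetical) tile size; in practice it would be chosen
    so that the tiles being worked on fit in the targeted level of memory.
    """
    n = len(f)
    a = [row[:] for row in f]  # work on a copy; lower triangle becomes R
    for k in range(0, n, block):
        kend = min(k + block, n)

        # Step 1: factor the diagonal block A_kk = R_kk R_kk^T.
        rkk = cholesky_unblocked([row[k:kend] for row in a[k:kend]])
        for i in range(k, kend):
            for j in range(k, kend):
                a[i][j] = rkk[i - k][j - k]

        # Step 2: triangular solve for the panel below the diagonal block,
        # R_ik = A_ik * R_kk^{-T}.
        for i in range(kend, n):
            for j in range(k, kend):
                s = a[i][j] - sum(a[i][p] * a[j][p] for p in range(k, j))
                a[i][j] = s / a[j][j]

        # Step 3: symmetric update of the trailing submatrix (lower triangle),
        # A_ij -= R_ik R_jk^T.
        for i in range(kend, n):
            for j in range(kend, i + 1):
                a[i][j] -= sum(a[i][p] * a[j][p] for p in range(k, kend))

    # Clear the stale upper triangle so the result is exactly R.
    for i in range(n):
        for j in range(i + 1, n):
            a[i][j] = 0.0
    return a


if __name__ == "__main__":
    F = [[4.0, 12.0, -16.0],
         [12.0, 37.0, -43.0],
         [-16.0, -43.0, 98.0]]
    for row in cholesky_blocked(F, block=2):
        print(row)
```

Each pass touches only one diagonal block, one panel, and the trailing submatrix, which is what lets a tile-sized working set be reused many times before it is evicted. A production implementation would express steps 2 and 3 as matrix-matrix kernels rather than scalar loops.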
Keywords :
Algorithm design and analysis; Parallel algorithms; Registers; Symmetric matrices; Tiles
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Supercomputing, 1989. Supercomputing '89. Proceedings of the 1989 ACM/IEEE Conference on
Conference_Location :
Reno, NV, United States
Print_ISBN :
0-89791-341-8
Type :
conf
DOI :
10.1145/76263.76287
Filename :
5349014