Title :
An Improved Systolic Architecture for LU Decomposition
Author :
Kim, DaeGon ; Rajopadhye, Sanjay V.
Author_Institution :
Colorado State University
Abstract :
LU-Decomposition is a classic problem for which many systolic array implementations have been proposed, the best of which takes 3n-3 time on n2/2 PEs, for a dense, n¿n matrix. In this paper, we first give a proof that if only nearest neighbor communication is allowed, this time is a lower bound. We then generalize it to 2n + n/k - 3 time if one allows k-bounded broadcasts (i.e., if it takes m/k time steps to broadcast a value to m destination nodes). We also present a new architecture with this improved execution time, which uses n2/2 PEs, each one consisting of two multiplier-subtractor units, but active only on alternate cycles. This leads to a speedup and efficiency of kn2/(6k+3) and 2k/(6k + 3) respectively. For k = 1, our proposed architecture achieves the performance of the best previously known systolic array implementation. Special cases of our results include similar improvements to algorithms for solving (upper and lower) triangular linear systems by (forward and backward) substitution.
Keywords :
Broadcasting; Computer architecture; Computer science; Delay; Electronic mail; Equations; Linear systems; Matrix decomposition; Nearest neighbor searches; Systolic arrays;
Conference_Titel :
Application-specific Systems, Architectures and Processors, 2006. ASAP '06. International Conference on
Conference_Location :
Steamboat Springs, CO
Print_ISBN :
0-7695-2682-9
DOI :
10.1109/ASAP.2006.12