Title :
Matrix bidiagonalization on the Trident processor
Author :
Soliman, Mostafa I. ; Sedukhin, Stanislav G.
Author_Institution :
Graduate Sch. of Comput. Sci. & Eng., Univ. of Aizu, Fukushima, Japan
Abstract :
This paper discusses the implementation and evaluation of the reduction of a dense matrix to bidiagonal form on the Trident processor. Two algorithms are simulated on the Trident processor: the standard Golub and Kahan Householder bidiagonalization algorithm, which is rich in matrix-vector operations, and the LAPACK subroutine _GEBRD, which is rich in a mixture of vector, matrix-vector, and matrix operations. We show how to use the Trident parallel execution units, ring, and communication registers to effectively perform the vector, matrix-vector, and matrix operations needed for bidiagonalizing a matrix. The number of clock cycles per FLOP is used as the metric to evaluate the performance of the Trident processor. Our results show that increasing the number of Trident lanes proportionally decreases the number of cycles needed per FLOP. On a 32 K×32 K matrix with 128 Trident lanes, using matrix-vector operations in the standard Golub and Kahan algorithm gives a speedup of around 1.5 times over using vector operations. Using matrix operations in the _GEBRD subroutine gives a speedup of around 3 times over vector operations and around 2 times over using matrix-vector operations in the standard Golub and Kahan algorithm.
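For readers unfamiliar with the algorithm named in the abstract, the following is a minimal scalar sketch of Golub and Kahan Householder bidiagonalization in plain Python. It is not the Trident implementation and not the blocked LAPACK _GEBRD routine; the function names `householder` and `bidiagonalize` are hypothetical and chosen only for illustration. It assumes a real m x n matrix with m >= n, stored as a list of rows, and returns only the bidiagonal factor (the orthogonal factors are not accumulated).

```python
import math

def householder(x):
    """Return (v, beta) so that (I - beta*v*v^T) x has zeros below its first entry."""
    norm = math.sqrt(sum(t * t for t in x))
    if norm == 0.0:
        return list(x), 0.0  # nothing to annihilate; beta = 0 gives the identity
    v = list(x)
    # Add the norm with the sign of x[0] to avoid cancellation.
    v[0] += math.copysign(norm, x[0])
    return v, 2.0 / sum(t * t for t in v)

def bidiagonalize(a):
    """Reduce an m x n matrix (m >= n) to upper bidiagonal form; returns a new matrix."""
    b = [row[:] for row in a]
    m, n = len(b), len(b[0])
    for k in range(n):
        # Left reflector: zero column k below the diagonal.
        v, beta = householder([b[i][k] for i in range(k, m)])
        for j in range(k, n):
            # Dot product of v with column j, then a scaled subtraction.
            s = beta * sum(v[i - k] * b[i][j] for i in range(k, m))
            for i in range(k, m):
                b[i][j] -= s * v[i - k]
        if k < n - 2:
            # Right reflector: zero row k to the right of the superdiagonal.
            v, beta = householder(b[k][k + 1:])
            for i in range(k, m):
                s = beta * sum(v[j - k - 1] * b[i][j] for j in range(k + 1, n))
                for j in range(k + 1, n):
                    b[i][j] -= s * v[j - k - 1]
    return b
```

Each reflector application above amounts to a matrix-vector product followed by a rank-1 update of the trailing submatrix, which is why the standard algorithm is described as rich in matrix-vector operations. Because only orthogonal transformations are applied, the Frobenius norm of the result equals that of the input, which gives a simple sanity check.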
Keywords :
matrix decomposition; parallel algorithms; parallel architectures; performance evaluation; simulation; subroutines; vectors; BLAS; GEBRD; Golub and Kahan Householder algorithm; LAPACK subroutine; Trident processor; communication registers; dense matrix; matrix bidiagonalization; matrix-vector operations; parallel execution units; performance; ring; scalable architecture; speedup; Algorithms; Architecture; Cities and towns; Clocks; Computer science; Hardware; Parallel processing; Parallel programming; Registers
Conference_Title :
Parallel and Distributed Processing Symposium, 2003. Proceedings. International
Print_ISBN :
0-7695-1926-1
DOI :
10.1109/IPDPS.2003.1213467