مرکز منطقه ای اطلاع رساني علوم و فناوري - Two-Stage Tridiagonal Reduction for Dense Symmetric Matrices Using Tile Algorithms on Multicore Architectures

DocumentCode :

3206283

Title :

Two-Stage Tridiagonal Reduction for Dense Symmetric Matrices Using Tile Algorithms on Multicore Architectures

Author :

Luszczek, Piotr ; Ltaief, Hatem ; Dongarra, Jack

Author_Institution :

Innovative Comput. Lab., Univ. of Tennessee, Knoxville, TN, USA

fYear :

2011

fDate :

16-20 May 2011

Firstpage :

944

Lastpage :

955

Abstract :

While successful implementations have already been written for one-sided transformations (e.g., QR, LU and Cholesky factorizations) on multicore architecture, getting high performance for two-sided reductions (e.g., Hessenberg, tridiagonal and bidiagonal reductions) is still an open and difficult research problem due to expensive memory-bound operations occurring during the panel factorization. The processor memory speed gap continues to widen, which has even further exacerbated the problem. This paper focuses on an efficient implementation of the tridiagonal reduction, which is the first algorithmic step toward computing the spectral decomposition of a dense symmetric matrix. The original matrix is translated into a tile layout i.e., a high performance data representation, which substantially enhances data locality. Following a two stage approach, the tile matrix is then transformed into band tridiagonal form using compute intensive kernels. The band form is further reduced to the required tridiagonal form using a left-looking bulge chasing technique to reduce memory traffic and memory contention. A dependence translation layer associated with a dynamic runtime system allows for scheduling and overlapping tasks generated from both stages. The obtained tile tridiagonal reduction significantly outperforms the state-of-the art numerical libraries (10X against multithreaded LAPACK with optimized MKL BLAS and 2.5X against the commercial numerical software Intel MKL) from medium to large matrix sizes.

Keywords :

matrix decomposition; multiprocessing systems; Hessenberg reduction; band tridiagonal matrix; bidiagonal reduction; dense symmetric matrix; dependence translation layer; dynamic runtime system; left-looking bulge chasing technique; matrix decomposition; multicore architecture; panel factorization; tile algorithm; tile matrix; two-stage tridiagonal reduction; Computer architecture; Decision support systems; Kernel; Layout; Libraries; Symmetric matrices; Tiles;

fLanguage :

English

Publisher :

ieee

Conference_Titel :

Parallel & Distributed Processing Symposium (IPDPS), 2011 IEEE International

Conference_Location :

Anchorage, AK

ISSN :

1530-2075

Print_ISBN :

978-1-61284-372-8

Electronic_ISBN :

1530-2075

Type :

conf

DOI :

10.1109/IPDPS.2011.91

Filename :

6012903

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3206283