DocumentCode
580121
Title
Implementing sparse matrix-vector multiplication on throughput-oriented processors
Author
Bell, Nathan ; Garland, Michael
fYear
2009
fDate
14-20 Nov. 2009
Firstpage
1
Lastpage
11
Abstract
Sparse matrix-vector multiplication (SpMV) is of singular importance in sparse linear algebra. In contrast to the uniform regularity of dense linear algebra, sparse operations encounter a broad spectrum of matrices ranging from the regular to the highly irregular. Harnessing the tremendous potential of throughput-oriented processors for sparse operations requires that we expose substantial fine-grained parallelism and impose sufficient regularity on execution paths and memory access patterns. We explore SpMV methods that are well-suited to throughput-oriented architectures like the GPU and which exploit several common sparsity classes. The techniques we propose are efficient, successfully utilizing large percentages of peak bandwidth. Furthermore, they deliver excellent total throughput, averaging 16 GFLOP/s and 10 GFLOP/s in double precision for structured grid and unstructured mesh matrices, respectively, on a GeForce GTX 285. This is roughly 2.8 times the throughput previously achieved on Cell BE and more than 10 times that of a quad-core Intel Clovertown system.
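For readers unfamiliar with the operation the abstract describes, the following is a minimal sketch of SpMV, y = A x, in plain Python. The abstract does not name a storage format, so the compressed sparse row (CSR) layout used here is an assumption chosen for illustration, and the names `spmv_csr`, `row_ptr`, `col_idx`, and `values` are hypothetical. It is not the authors' GPU implementation; it only shows why the work splits naturally by row, and why rows of differing length lead to irregular execution and memory access.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i;
    col_idx[k] and values[k] give the column and value of nonzero k.
    """
    num_rows = len(row_ptr) - 1
    y = [0.0] * num_rows
    # Each row is independent, so on a parallel machine every iteration
    # of this outer loop could be assigned to its own thread.
    for row in range(num_rows):
        acc = 0.0
        # Row lengths vary, so the inner trip count (and the gather
        # from x) differs from row to row.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Illustrative 3x3 matrix:
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
print(y)  # [7.0, 4.0, 18.0]
```

Each nonzero contributes one multiply and one add while requiring a value, an index, and an element of x to be read, which is consistent with the abstract measuring efficiency as a fraction of peak bandwidth.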
Keywords
graphics processing units; linear algebra; matrix multiplication; parallel architectures; sparse matrices; vectors; Cell BE; GPU; GeForce GTX 285; SpMV; dense linear algebra; peak bandwidth; quad-core Intel Clovertown system; sparse linear algebra; sparse matrix-vector multiplication; structured grid; throughput-oriented architecture; throughput-oriented processor; unstructured mesh matrices
fLanguage
English
Publisher
IEEE
Conference_Titel
High Performance Computing Networking, Storage and Analysis, Proceedings of the Conference on
Conference_Location
Portland, OR
Type
conf
DOI
10.1145/1654059.1654078
Filename
6375570