DocumentCode
3588751
Title
Atomic reduction based sparse matrix-transpose vector multiplication on GPUs
Author
Yuan Tao ; Yangdong Deng ; Shuai Mu ; Mingfa Zhu ; Limin Xiao ; Li Ruan ; Zhibin Huang
Author_Institution
State Key Lab. of Software Dev. Environ., Beihang Univ., Beijing, China
fYear
2014
Firstpage
987
Lastpage
992
Abstract
Sparse Matrix-Transpose Vector Product (SMTVP) is a frequently used computation pattern in high-performance computing applications. Current linear algebra packages typically compute it by transposing the matrix and then performing a Sparse Matrix-Vector Product (SMVP). However, the transposition step can be a serious bottleneck on modern parallel computing platforms. Previous work proposed a relatively complex data structure for computing SMTVP efficiently on multi-core CPUs, but it proved inefficient on GPUs. In this work, we show that the Compressed Sparse Row (CSR) based SMVP algorithm can also be efficient for SMTVP computation on modern GPUs. The proposed method uses atomic operations to perform the reduction in each inner product between a row of the transposed matrix and the vector. Experimental results show that this simple technique can outperform the transposition-plus-SMVP flow for SMTVP provided in the CUSPARSE package by up to 405-fold.
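The access pattern described in the abstract can be illustrated with a small sequential sketch. This is not the authors' GPU kernel: the function name `smtvp_csr`, its parameters, and the example matrix are assumptions made for illustration. The sketch walks the CSR rows of A directly and scatters each product into the output entry selected by the column index, so no transposed copy of A is built. In a parallel GPU version, several rows may update the same output entry at once, which is where an atomic add would presumably replace the plain `+=`.

```python
def smtvp_csr(row_ptr, col_idx, values, x, n_cols):
    """Compute y = A^T x for a CSR matrix A without transposing A.

    Hypothetical helper for illustration; sequential, so no atomics are needed.
    """
    y = [0.0] * n_cols
    for i in range(len(row_ptr) - 1):  # a GPU version might assign rows to threads
        xi = x[i]
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Different rows can share a column index and hit the same y[j];
            # a parallel version would need an atomic add at this update.
            y[col_idx[k]] += values[k] * xi
    return y


# Example matrix (assumed for illustration):
# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]

y = smtvp_csr(row_ptr, col_idx, values, x, n_cols=3)
print(y)
```

Rows 0 and 2 both contribute to `y[0]` and `y[2]`; these are the colliding updates that make the reduction a candidate for atomic operations when rows are processed concurrently.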
Keywords
graphics processing units; mathematics computing; matrix multiplication; sparse matrices; CSR based SMVP algorithm; CUSPARSE package; SMTVP computation; atomic reduction; compressed sparse row; sparse matrix-transpose vector multiplication; sparse matrix-transpose vector product; transposition plus SMVP; Graphics processing units; Indexes; Instruction sets; Laboratories; Sparse matrices; Throughput; CSR; GPU; atomic operation; compressed sparse row; sparse matrix-transpose vector product;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Systems (ICPADS), 2014 20th IEEE International Conference on
Type
conf
DOI
10.1109/PADSW.2014.7097920
Filename
7097920