DocumentCode :
3091022
Title :
Fast Sparse Matrix Matrix Product Based on ELLR-T and GPU Computing
Author :
Vazquez, Francisco ; Ortega, G. ; Fernandez, J. Javier ; García, I. ; Garzón, E.M.
Author_Institution :
Supercomput. & Algorithms Group, Univ. of Almería, Almería, Spain
fYear :
2012
fDate :
10-13 July 2012
Firstpage :
669
Lastpage :
674
Abstract :
A wide range of applications in engineering and scientific computing are based on the computation of matrix products in which one of the matrices is sparse. The computational requirements of these operations become very high as the dimensions of the matrices increase. The goal of this work is to accelerate the sparse matrix-matrix product (SpMM) on Graphics Processing Units (GPUs). SpMM can be computed as a set of sparse matrix-vector operations (SpMV). However, this approach does not reach optimal performance because it cannot benefit from the high ratio of computation to memory access associated with the SpMM operation. In this work, a routine called FastSpMM is described and its performance is evaluated. FastSpMM can be considered an extension of ELLR-T, a routine for computing SpMV on GPUs that is based on the ELLPACK-R storage format for sparse matrices. FastSpMM combines the high computation-to-memory-access ratio of SpMM with the advantages of ELLR-T to exploit the GPU architecture. The CUSPARSE library supplied by NVIDIA, which also includes routines to compute SpMM on GPUs, is used in this work as a reference for performance comparison. Experimental evaluations on a representative set of test matrices show that FastSpMM outperforms the corresponding CUSPARSE routine.
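The contrast the abstract draws between SpMM computed as repeated SpMV and a fused SpMM can be illustrated with a small CPU-side sketch. This is not the authors' FastSpMM or ELLR-T GPU kernel. It only assumes the commonly described ELLPACK-R layout (padded value and column-index arrays plus a per-row length array), and all function names (`to_ellpack_r`, `spmv`, `spmm_by_columns`, `spmm_fused`) are hypothetical names chosen for illustration.

```python
def to_ellpack_r(dense):
    """Convert a dense matrix (list of rows) to ELLPACK-R style arrays.

    Returns (vals, cols, row_len): vals and cols are padded to the longest
    row, and row_len[i] holds the true number of nonzeros in row i.
    """
    row_len = [sum(1 for v in row if v != 0) for row in dense]
    width = max(row_len, default=0)
    vals, cols = [], []
    for row in dense:
        nz = [(v, j) for j, v in enumerate(row) if v != 0]
        pad = width - len(nz)
        vals.append([v for v, _ in nz] + [0.0] * pad)
        cols.append([j for _, j in nz] + [0] * pad)
    return vals, cols, row_len


def spmv(vals, cols, row_len, x):
    """y = A @ x; row_len lets each row skip its padding entries."""
    return [sum(vals[i][k] * x[cols[i][k]] for k in range(row_len[i]))
            for i in range(len(row_len))]


def spmm_by_columns(vals, cols, row_len, B):
    """C = A @ B as one SpMV per column of B (A is re-read for every column)."""
    n_rows, n_cols = len(row_len), len(B[0])
    columns = [spmv(vals, cols, row_len, [row[j] for row in B])
               for j in range(n_cols)]
    return [[columns[j][i] for j in range(n_cols)] for i in range(n_rows)]


def spmm_fused(vals, cols, row_len, B):
    """C = A @ B where each nonzero of A is read once and reused
    across all columns of B (more arithmetic per element of A loaded)."""
    n_cols = len(B[0])
    C = [[0.0] * n_cols for _ in row_len]
    for i, length in enumerate(row_len):
        for k in range(length):
            a, b_row = vals[i][k], B[cols[i][k]]
            for j in range(n_cols):
                C[i][j] += a * b_row[j]
    return C
```

Both variants return the same product. The fused loop differs only in how often the sparse matrix is traversed, which is the computation-to-memory-access argument the abstract makes; actual GPU performance depends on thread mapping and memory coalescing, which this sketch does not model.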
Keywords :
graphics processing units; sparse matrices; CUSPARSE library; ELLR-T computing; FastSpMM; GPU computing; NVIDIA; SpMM; computational requirements; engineering computing; fast sparse matrix matrix product; graphics processing units; optimal performance; scientific computing; Acceleration; Computer architecture; Graphics processing unit; Instruction sets; Libraries; Sparse matrices; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Processing with Applications (ISPA), 2012 IEEE 10th International Symposium on
Conference_Location :
Leganes
Print_ISBN :
978-1-4673-1631-6
Type :
conf
DOI :
10.1109/ISPA.2012.99
Filename :
6280359