DocumentCode :
1830667
Title :
An Efficient Sparse Matrix Multiplication for Skewed Matrix on GPU
Author :
Shah, Mubarak ; Patel, Vaibhav
fYear :
2012
fDate :
25-27 June 2012
Firstpage :
1301
Lastpage :
1306
Abstract :
This paper presents ALIGNED_COO, a new sparse matrix format that extends the COO format to optimize the performance of large sparse matrices with a skewed distribution of non-zero elements. Load balancing, alignment, and synchronization-free distribution of the workload are three important factors in improving the performance of sparse matrices that represent power-law graphs. The Coordinate (COO) format is selected for extension in this paper because it is the most suitable format for sparse matrices that represent power-law graphs. The ALIGNED_COO format tries to maximize alignment across the computing resources. Our heuristic for deciding the degree of concurrency differs from existing approaches. Despite the availability of other popular sparse formats, the ALIGNED_COO format helps to gain better performance without any extra memory overhead. Our approach not only achieves higher performance on skewed matrices with a power-law distribution, but also gives appreciable performance for a wide range of sparse matrix patterns. The proposed implementation of the SpMV kernel for the ALIGNED_COO sparse format helps to achieve 1.0-25.72 times higher performance than the COO_flat kernel, with an increase in the level of accuracy. The average performance gain over other sparse formats is in a tolerable range of 0.89-48.8.
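For orientation, the following is a minimal sketch of sparse matrix-vector multiplication (SpMV) over the plain COO format that the abstract takes as its starting point. It is not the ALIGNED_COO layout or the authors' GPU kernel, neither of which the abstract specifies. The function names `coo_spmv` and `nonzeros_per_row` and the sample matrix are hypothetical and chosen only for illustration.

```python
def coo_spmv(rows, cols, vals, x, n_rows):
    """Compute y = A @ x for a matrix A stored as COO triples.

    rows[i], cols[i], vals[i] describe the i-th non-zero of A.
    This is a sequential reference loop, not a GPU kernel.
    """
    y = [0.0] * n_rows
    for r, c, v in zip(rows, cols, vals):
        y[r] += v * x[c]
    return y


def nonzeros_per_row(rows, n_rows):
    """Count the non-zeros in each row to expose a skewed distribution."""
    counts = [0] * n_rows
    for r in rows:
        counts[r] += 1
    return counts


# Illustrative 4x4 matrix (assumed data): row 0 holds 4 of the 7 non-zeros.
rows = [0, 0, 0, 0, 1, 2, 3]
cols = [0, 1, 2, 3, 1, 2, 0]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
x = [1.0, 1.0, 1.0, 1.0]

y = coo_spmv(rows, cols, vals, x, n_rows=4)
counts = nonzeros_per_row(rows, n_rows=4)
```

Because COO stores one triple per non-zero, work can be divided by non-zero count rather than by row. A row-per-thread scheme, by contrast, would leave most workers idle while one of them processes the heavy row 0.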
Keywords :
concurrency control; graphics processing units; mathematics computing; matrix multiplication; parallel programming; performance evaluation; resource allocation; sparse matrices; vectors; ALIGNED_COO sparse format; GPU; SpMV kernel; computing resources; coordinate format; large sparse matrix performance optimization; load balancing; performance gain; performance improvement; power-law distribution; power-law graph; skewed matrix; sparse matrix format; sparse matrix patterns; sparse matrix vector multiplication; synchronization free distribution; work load alignment; Graphics processing unit; Indexes; Instruction sets; Kernel; Load management; Sparse matrices; Vectors; Load balance; Power-law graph; SpMV;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems (HPCC-ICESS)
Conference_Location :
Liverpool
Print_ISBN :
978-1-4673-2164-8
Type :
conf
DOI :
10.1109/HPCC.2012.192
Filename :
6332328