An Efficient Sparse Matrix Multiplication for Skewed Matrix on GPU

Author

Shah, Mubarak ; Patel, Vaibhav

fYear

2012

fDate

25-27 June 2012

Firstpage

1301

Lastpage

1306

Abstract

This paper presents ALIGNED_COO, a new sparse matrix format that extends the COO format to optimize the performance of large sparse matrices with a skewed distribution of non-zero elements. Load balancing, alignment, and synchronization-free distribution of the workload are three important factors in improving the performance of sparse matrices that represent power-law graphs. The Coordinate (COO) format is selected for extension in this paper because it is the most suitable format for sparse matrices representing power-law graphs. The ALIGNED_COO format tries to maximize alignment across the computing resources. Our heuristic for deciding the degree of concurrency differs from existing approaches. Despite the availability of other popular sparse formats, the ALIGNED_COO format helps to gain better performance without any extra memory overhead. Our approach not only achieves higher performance on skewed matrices with a power-law distribution, but also gives appreciable performance for a wide range of sparse matrix patterns. The proposed implementation of the SpMV kernel for the ALIGNED_COO sparse format achieves 1.0-25.72 times higher performance than the COO_flat kernel, with an increase in the level of accuracy. The average performance gain over other sparse formats is in a tolerable range of 0.89-48.8.
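
For readers unfamiliar with the baseline that the abstract extends, the following is a minimal sequential sketch of sparse matrix-vector multiplication (SpMV) over the plain COO format, in which each non-zero is stored as a (row, column, value) triple. It does not show ALIGNED_COO itself, whose layout is not described in the abstract, and it is not the COO_flat GPU kernel. The function name `coo_spmv` and the small example matrix are assumptions made purely for illustration.

```python
def coo_spmv(rows, cols, vals, x, n_rows):
    """Compute y = A @ x where A is given in COO form.

    rows[i], cols[i], vals[i] describe the i-th non-zero of A.
    This is a plain sequential reference, not a GPU kernel.
    """
    y = [0.0] * n_rows
    # Each non-zero contributes independently to its output row,
    # so the work per entry is uniform regardless of row length.
    for r, c, v in zip(rows, cols, vals):
        y[r] += v * x[c]
    return y


# Hypothetical skewed 3x4 matrix: row 0 holds four of the six non-zeros.
rows = [0, 0, 0, 0, 1, 2]
cols = [0, 1, 2, 3, 1, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.0, 1.0, 1.0, 1.0]

y = coo_spmv(rows, cols, vals, x, n_rows=3)
print(y)
```

Because the loop iterates over non-zeros rather than rows, a long row costs no more per entry than a short one, which is one reason COO-style storage is commonly considered for skewed, power-law matrices. A parallel version would additionally need to handle concurrent updates to the same `y[r]`.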

Keywords

concurrency control; graphics processing units; mathematics computing; matrix multiplication; parallel programming; performance evaluation; resource allocation; sparse matrices; vectors; ALIGNED_COO sparse format; GPU; SpMV kernel; computing resources; coordinate format; large sparse matrix performance optimization; load balancing; performance gain; performance improvement; power-law distribution; power-law graph; skewed matrix; sparse matrix format; sparse matrix patterns; sparse matrix vector multiplication; synchronization free distribution; work load alignment; Graphics processing unit; Indexes; Instruction sets; Kernel; Load management; Sparse matrices; Vectors; Load balance; Power-law graph; SpMV;

fLanguage

English

Publisher

IEEE

Conference_Titel

High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems (HPCC-ICESS), 2012 IEEE 14th International Conference on

Conference_Location

Liverpool

Print_ISBN

978-1-4673-2164-8

Type

conf

DOI

10.1109/HPCC.2012.192

Filename

6332328