Regional Information Center for Science and Technology - Accelerating sparse matrix-vector multiplication on GPUs using bit-representation-optimized schemes

DocumentCode :

692872

Title :

Accelerating sparse matrix-vector multiplication on GPUs using bit-representation-optimized schemes

Author :

Wai Teng Tang ; Wen Jun Tan ; Ray, Ruben ; Yi Wen Wong ; Weiguang Chen ; Kuo, Shin-Hong ; Goh, Rick Siow Mong ; Turner, Stephen John ; Weng-Fai Wong

Author_Institution :

Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore

fYear :

2013

fDate :

17-22 Nov. 2013

Firstpage :

Lastpage :

Abstract :

The sparse matrix-vector (SpMV) multiplication routine is an important building block used in many iterative algorithms for solving scientific and engineering problems. One of the main challenges of SpMV is its memory-boundedness. Although compression has been proposed previously to improve SpMV performance on CPUs, its use has not been demonstrated on the GPU because of the serial nature of many compression and decompression schemes. In this paper, we introduce a family of bit-representation-optimized (BRO) compression schemes for representing sparse matrices on GPUs. The proposed schemes, BRO-ELL, BRO-COO, and BRO-HYB, perform compression on index data and help to speed up SpMV on GPUs through reduction of memory traffic. Furthermore, we formulate a BRO-aware matrix reordering scheme as a data clustering problem and use it to increase compression ratios. With the proposed schemes, experiments show that average speedups of 1.5× compared to ELLPACK and HYB can be achieved for SpMV on GPUs.
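
The abstract's central idea, compressing index data so that SpMV moves fewer bytes, can be illustrated with a small sketch. This record does not describe the actual BRO-ELL, BRO-COO, or BRO-HYB layouts, so the code below is not the authors' scheme. It assumes a generic approach (delta-encoding each row's sorted column indices, then packing the deltas at a fixed per-row bit width) and runs serially on the CPU. The names `compress_row`, `decompress_row`, and `spmv` are hypothetical.

```python
def compress_row(cols):
    """Delta-encode sorted column indices and pack them at a fixed bit width.

    Assumption: an illustrative scheme, not the BRO format from the paper.
    Returns (packed, width, count), where packed is a single Python integer.
    """
    if not cols:
        return 0, 1, 0
    # First entry is stored as-is; the rest are gaps to the previous index.
    deltas = [cols[0]] + [b - a for a, b in zip(cols, cols[1:])]
    # Smallest width (at least 1 bit) that can hold every delta in this row.
    width = max(max(1, d.bit_length()) for d in deltas)
    packed = 0
    for k, d in enumerate(deltas):
        packed |= d << (k * width)
    return packed, width, len(cols)


def decompress_row(packed, width, count):
    """Recover the column indices by unpacking deltas and summing them."""
    mask = (1 << width) - 1
    cols, col = [], 0
    for k in range(count):
        col += (packed >> (k * width)) & mask
        cols.append(col)
    return cols


def spmv(rows, x):
    """Compute y = A @ x, where each row is (packed, width, count, values)."""
    y = []
    for packed, width, count, values in rows:
        cols = decompress_row(packed, width, count)
        y.append(sum(v * x[c] for v, c in zip(values, cols)))
    return y


# A 3x4 example matrix given as (column indices, values) per row.
matrix = [
    ([0, 2], [1.0, 2.0]),
    ([1], [3.0]),
    ([0, 1, 3], [4.0, 5.0, 6.0]),
]
rows = [compress_row(cols) + (vals,) for cols, vals in matrix]
x = [1.0, 2.0, 3.0, 4.0]
y = spmv(rows, x)
```

In this sketch a row with indices 1000, 1001, 1003 needs 10 bits per entry instead of a full 32-bit integer, which is the kind of memory-traffic reduction the abstract refers to. A GPU implementation would additionally have to decode in parallel, which is the difficulty the abstract attributes to serial decompression schemes.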

Keywords :

graphics processing units; iterative methods; mathematics computing; matrix multiplication; pattern clustering; sparse matrices; BRO compression schemes; BRO-COO; BRO-ELL; BRO-HYB; BRO-aware matrix reordering scheme; GPU; SpMV; bit-representation-optimized compression schemes; bit-representation-optimized schemes; data clustering problem; decompression schemes; engineering problems; iterative algorithms; memory traffic reduction; memory-boundedness; scientific problems; sparse matrix-vector multiplication routine; Abstracts; Acceleration; Educational institutions; Indexes; Instruction sets; Optimization; GPU; Sparse matrix format; data compression; matrix-vector multiplication; memory bandwidth; parallelism;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

High Performance Computing, Networking, Storage and Analysis (SC), 2013 International Conference for

Conference_Location :

Denver, CO

Print_ISBN :

978-1-4503-2378-9

Type :

conf

DOI :

10.1145/2503210.2503234

Filename :

6877459

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=692872