Title :
Performance Optimization Using Partitioned SpMV on GPUs and Multicore CPUs
Author :
Wangdong Yang ; Kenli Li ; Zeyao Mo ; Keqin Li
Author_Institution :
Coll. of Inf. Sci. & Eng., Hunan Univ., Changsha, China
Abstract :
This paper presents a sparse matrix partitioning strategy to improve the performance of SpMV on GPUs and multicore CPUs. This method has wide adaptability for different types of sparse matrices, and is different from existing methods which only adapt to some particular sparse matrices. In addition, our partitioning method can obtain dense blocks by analyzing the probability distribution of non-zero elements in a sparse matrix, and result in very low proportion of zero padded. We make the following significant contributions. (1) We present a partitioning strategy of sparse matrices based on probabilistic modeling of non-zero elements in a row. (2) We prove that our method has the highest mean density compared with other strategies according to certain given ratios of partition obtained from the computing powers of heterogeneous processors. (3) We develop a CPU-GPU hybrid parallel computing model for SpMV on GPUs and multicore CPUs in a heterogeneous computing platform. Our partitioning strategy has balanced load distribution and the performance of SpMV is significantly improved when a sparse matrix is partitioned into dense blocks using our method. The average performance improvement of our solution for SpMV is about 15.75 percent on multicore CPUs, compared to that of the other solutions. By considering the rows of a matrix in a unique order based on the probability mass function of the number of non-zeros in a row, the average performance improvement of our solution for SpMV is about 33.52 percent on GPUs and multicore CPUs of a heterogeneous computing platform, compared to that of the partitioning methods based on the original row order of a matrix.
Keywords :
graphics processing units; matrix multiplication; multiprocessing systems; parallel algorithms; sparse matrices; statistical distributions; CPU-GPU hybrid parallel computing model; balanced load distribution; heterogeneous computing platform; heterogeneous processors; multicore CPU; partitioned SpMV; probability distribution; probability mass function; sparse matrix partitioning strategy; sparse matrix-vector multiplication; Computational modeling; Multicore processing; Partitioning algorithms; Program processors; Sparse matrices; Vectors; GPU; matrix partition; multicore CPU; probability distribution; sparse matrix-vector multiplication;
Journal_Title :
Computers, IEEE Transactions on
DOI :
10.1109/TC.2014.2366731