• DocumentCode
    55076
  • Title

    Performance Optimization Using Partitioned SpMV on GPUs and Multicore CPUs

  • Author

    Wangdong Yang ; Kenli Li ; Zeyao Mo ; Keqin Li

  • Author_Institution
    Coll. of Inf. Sci. & Eng., Hunan Univ., Changsha, China
  • Volume
    64
  • Issue
    9
  • fYear
    2015
  • fDate
    Sept. 1 2015
  • Firstpage
    2623
  • Lastpage
    2636
  • Abstract
    This paper presents a sparse matrix partitioning strategy to improve the performance of SpMV on GPUs and multicore CPUs. This method has wide adaptability for different types of sparse matrices, and is different from existing methods which only adapt to some particular sparse matrices. In addition, our partitioning method can obtain dense blocks by analyzing the probability distribution of non-zero elements in a sparse matrix, and result in very low proportion of zero padded. We make the following significant contributions. (1) We present a partitioning strategy of sparse matrices based on probabilistic modeling of non-zero elements in a row. (2) We prove that our method has the highest mean density compared with other strategies according to certain given ratios of partition obtained from the computing powers of heterogeneous processors. (3) We develop a CPU-GPU hybrid parallel computing model for SpMV on GPUs and multicore CPUs in a heterogeneous computing platform. Our partitioning strategy has balanced load distribution and the performance of SpMV is significantly improved when a sparse matrix is partitioned into dense blocks using our method. The average performance improvement of our solution for SpMV is about 15.75 percent on multicore CPUs, compared to that of the other solutions. By considering the rows of a matrix in a unique order based on the probability mass function of the number of non-zeros in a row, the average performance improvement of our solution for SpMV is about 33.52 percent on GPUs and multicore CPUs of a heterogeneous computing platform, compared to that of the partitioning methods based on the original row order of a matrix.
  • Keywords
    graphics processing units; matrix multiplication; multiprocessing systems; parallel algorithms; sparse matrices; statistical distributions; CPU-GPU hybrid parallel computing model; balanced load distribution; heterogeneous computing platform; heterogeneous processors; multicore CPU; partitioned SpMV; probability distribution; probability mass function; sparse matrix partitioning strategy; sparse matrix-vector multiplication; Computational modeling; Multicore processing; Partitioning algorithms; Program processors; Sparse matrices; Vectors; GPU; matrix partition; multicore CPU; probability distribution; sparse matrix-vector multiplication;
  • fLanguage
    English
  • Journal_Title
    Computers, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    0018-9340
  • Type

    jour

  • DOI
    10.1109/TC.2014.2366731
  • Filename
    6965639