• DocumentCode
    1830667
  • Title

    An Efficient Sparse Matrix Multiplication for Skewed Matrix on GPU

  • Author

    Shah, Mubarak ; Patel, Vaibhav

  • fYear
    2012
  • fDate
    25-27 June 2012
  • Firstpage
    1301
  • Lastpage
    1306
  • Abstract
    This paper presents ALIGNED_COO, a new sparse matrix format that extends the COO format to optimize the performance of large sparse matrices with a skewed distribution of non-zero elements. Load balancing, alignment, and synchronization-free distribution of workload are three important factors in improving the performance of sparse matrices representing power-law graphs. The Coordinate (COO) format is selected for extension in this paper because it is the most suitable format for sparse matrices representing power-law graphs. The ALIGNED_COO format tries to maximize alignment across the computing resources. Our heuristic for deciding the degree of concurrency differs from existing approaches. Despite the availability of other popular sparse formats, the ALIGNED_COO format gains better performance without any extra memory overhead. Our approach not only achieves higher performance on skewed matrices with a power-law distribution, but also gives appreciable performance for a wide range of sparse matrix patterns. The proposed implementation of the SpMV kernel for the ALIGNED_COO sparse format achieves 1.0-25.72 times higher performance than the COO_flat kernel with an increase in the level of accuracy. The average performance gain over other sparse formats is in the tolerable range of 0.89-48.8.
  • Keywords
    concurrency control; graphics processing units; mathematics computing; matrix multiplication; parallel programming; performance evaluation; resource allocation; sparse matrices; vectors; ALIGNED_COO sparse format; GPU; SpMV kernel; computing resources; coordinate format; large sparse matrix performance optimization; load balancing; performance gain; performance improvement; power-law distribution; power-law graph; skewed matrix; sparse matrix format; sparse matrix patterns; sparse matrix vector multiplication; synchronization free distribution; work load alignment; Graphics processing unit; Indexes; Instruction sets; Kernel; Load management; Sparse matrices; Vectors; Load balance; Power-law graph; SpMV;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems (HPCC-ICESS), 2012 IEEE 14th International Conference on
  • Conference_Location
    Liverpool
  • Print_ISBN
    978-1-4673-2164-8
  • Type

    conf

  • DOI
    10.1109/HPCC.2012.192
  • Filename
    6332328
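
The abstract builds on the plain COO format, in which each non-zero is stored as a (row, column, value) triple. The sketch below is only an illustration of that baseline, written sequentially in Python. It is not the ALIGNED_COO layout or the paper's GPU kernel, neither of which is described in this record. The function names `coo_spmv` and `split_nonzeros` are hypothetical. The second function shows one reason COO is attractive for skewed matrices: work can be divided per non-zero instead of per row, so one very long row does not overload a single worker.

```python
def coo_spmv(rows, cols, vals, x, n_rows):
    """Compute y = A @ x for a matrix A stored in plain COO format.

    rows[i], cols[i], vals[i] describe the i-th non-zero of A.
    This is a sequential reference sketch, not a GPU kernel.
    """
    if not (len(rows) == len(cols) == len(vals)):
        raise ValueError("rows, cols and vals must have the same length")
    y = [0.0] * n_rows
    # Each non-zero contributes one multiply-add to its output row.
    for r, c, v in zip(rows, cols, vals):
        y[r] += v * x[c]
    return y


def split_nonzeros(nnz, n_workers):
    """Split nnz non-zeros into contiguous chunks of near-equal size.

    Returns a list of (start, end) half-open index ranges, one per worker.
    Chunk sizes differ by at most one, regardless of how the non-zeros
    are distributed across rows. (Illustrative assumption: a real GPU
    kernel would also need to combine partial sums for rows that span
    two chunks.)
    """
    if n_workers <= 0:
        raise ValueError("n_workers must be positive")
    base, extra = divmod(nnz, n_workers)
    bounds = []
    start = 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds
```

For example, the 3x3 matrix with rows `[1, 0, 2]`, `[0, 0, 3]`, `[4, 5, 6]` is stored as `rows = [0, 0, 1, 2, 2, 2]`, `cols = [0, 2, 2, 0, 1, 2]`, `vals = [1, 2, 3, 4, 5, 6]`, and `split_nonzeros(6, 4)` assigns two non-zeros to each of the first two workers and one to each of the others.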