• DocumentCode
    3643418
  • Title

    Reduced-Bandwidth Multithreaded Algorithms for Sparse Matrix-Vector Multiplication

  • Author

    Aydin Buluç;Samuel Williams;Leonid Oliker;James Demmel

  • Author_Institution
    Comput. Res. Div., Lawrence Berkeley Nat. Lab., Berkeley, CA, USA
  • fYear
    2011
  • fDate
    5/1/2011 12:00:00 AM
  • Firstpage
    721
  • Lastpage
    733
  • Abstract
    On multicore architectures, the ratio of peak memory bandwidth to peak floating-point performance (byte:flop ratio) is decreasing as core counts increase, further limiting the performance of bandwidth limited applications. Multiplying a sparse matrix (as well as its transpose in the unsymmetric case) with a dense vector is the core of sparse iterative methods. In this paper, we present a new multithreaded algorithm for the symmetric case which potentially cuts the bandwidth requirements in half while exposing lots of parallelism in practice. We also give a new data structure transformation, called bit masked register blocks, which promises significant reductions on bandwidth requirements by reducing the number of indexing elements without introducing additional fill-in zeros. Our work shows how to incorporate this transformation into existing parallel algorithms (both symmetric and unsymmetric) without limiting their parallel scalability. Experimental results indicate that the combined benefits of bit masked register blocks and the new symmetric algorithm can be as high as a factor of 3.5× in multicore performance over an already scalable parallel approach. We also provide a model that accurately predicts the performance of the new methods, showing that even larger performance gains are expected in future multicore systems as current trends (decreasing byte:flop ratio and larger sparse matrices) continue.
  • Keywords
    "Bandwidth","Symmetric matrices","Sparse matrices","Parallel processing","Registers","Multicore processing","Program processors"
  • Publisher
    ieee
  • Conference_Titel
    Parallel & Distributed Processing Symposium (IPDPS), 2011 IEEE International
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-61284-372-8
  • Type

    conf

  • DOI
    10.1109/IPDPS.2011.73
  • Filename
    6012883