• DocumentCode
    625593
  • Title

    Communication-Optimal Parallel Recursive Rectangular Matrix Multiplication

  • Author

    Demmel, J. ; Eliahu, David ; Fox, A. ; Kamil, Shoaib ; Lipshitz, Benjamin ; Schwartz, Ofer ; Spillinger, Omer

  • Author_Institution
    Math. Dept., UC Berkeley, Berkeley, CA, USA
  • fYear
    2013
  • fDate
    20-24 May 2013
  • Firstpage
    261
  • Lastpage
    272
  • Abstract
    Communication-optimal algorithms are known for square matrix multiplication. Here, we obtain the first communication-optimal algorithm for all dimensions of rectangular matrices. Combining the dimension-splitting technique of Frigo, Leiserson, Prokop and Ramachandran (1999) with the recursive BFS/DFS approach of Ballard, Demmel, Holtz, Lipshitz and Schwartz (2012) allows for a communication-optimal as well as cache and network-oblivious algorithm. Moreover, the implementation is simple: approximately 50 lines of code for the shared-memory version. Since the new algorithm minimizes communication across the network, between NUMA domains, and between levels of cache, it performs well in practice on both shared and distributed-memory machines. We show significant speedups over existing parallel linear algebra libraries both on a 32-core shared-memory machine and on a distributed-memory supercomputer.
  • Keywords
    cache storage; digital arithmetic; distributed shared memory systems; matrix multiplication; parallel algorithms; NUMA domains; cache-network-oblivious algorithm; communication-optimal algorithms; communication-optimal parallel recursive rectangular matrix multiplication; dimension-splitting technique; distributed-memory machines; distributed-memory supercomputer; parallel linear algebra libraries; recursive BFS-DFS approach; shared memory machines; square matrix multiplication; Algorithm design and analysis; Bandwidth; Layout; Libraries; Linear algebra; Memory management; Program processors; linear algebra; matrix multiplication; ommunication-avoiding algorithms;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Parallel & Distributed Processing (IPDPS), 2013 IEEE 27th International Symposium on
  • Conference_Location
    Boston, MA
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-4673-6066-1
  • Type

    conf

  • DOI
    10.1109/IPDPS.2013.80
  • Filename
    6569817