• DocumentCode
    2365779
  • Title

    FPGA Based High Performance Double-Precision Matrix Multiplication

  • Author

    Kumar, Vinay BY ; Joshi, Siddharth ; Patkar, Sachin B. ; Narayanan, H.

  • Author_Institution
    Dept. of Electr. Eng., Indian Inst. of Technol., Mumbai
  • fYear
    2009
  • fDate
    5-9 Jan. 2009
  • Firstpage
    341
  • Lastpage
    346
  • Abstract
    We present two designs (I and II) for IEEE 754 double-precision floating-point matrix multiplication, an important kernel in many tile-based BLAS algorithms, optimized for implementation on high-end FPGAs. Both designs are based on the rank-1 update scheme, can handle arbitrary matrix sizes, and sustain their peak performance except during an initial latency period. Through these designs, the trade-offs between local memory and bandwidth for an FPGA implementation are demonstrated, and an analysis is presented for the optimal choice of design parameters. The designs, implemented on a Virtex-5 SX240T FPGA, scale gracefully from 1 to 40 processing elements (PEs) with less than 1% degradation in the design frequency of 373 MHz. With 40 PEs and a design speed of 373 MHz, a sustained performance of 29.8 GFLOPS is possible with a bandwidth requirement of 750 MB/s for design-II and 5.9 GB/s for design-I.
  • Keywords
    field programmable gate arrays; matrix algebra; FPGA implementation; GFLOPS; arbitrary matrix sizes; bit rate 5.9 Gbit/s; bit rate 750 Mbit/s; double precision floating point matrix multiplication; frequency 373 MHz; tile-based BLAS algorithms; Acceleration; Algorithm design and analysis; Bandwidth; Degradation; Delay; Design optimization; Field programmable gate arrays; Hardware; Kernel; Very large scale integration; FPGA based HPC; Matrix Matrix Multiply; Performance-Bandwidth
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    VLSI Design, 2009 22nd International Conference on
  • Conference_Location
    New Delhi
  • ISSN
    1063-9667
  • Print_ISBN
    978-0-7695-3506-7
  • Type

    conf

  • DOI
    10.1109/VLSI.Design.2009.13
  • Filename
    4749697
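
The rank-1 update scheme and the throughput figure quoted in the abstract can be illustrated with a minimal software sketch. This is not the authors' hardware design: the function names `matmul_rank1` and `peak_gflops` are hypothetical, and the figure of 2 floating-point operations per PE per cycle (one multiply and one add) is an assumption chosen because it reproduces the stated 29.8 GFLOPS at 40 PEs and 373 MHz.

```python
def matmul_rank1(A, B):
    """Compute C = A * B as a sum of rank-1 (outer-product) updates.

    For each index p, column p of A and row p of B form an outer
    product that is accumulated into C.
    """
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for p in range(k):
        col = [A[i][p] for i in range(n)]  # column p of A
        row = B[p]                         # row p of B
        for i in range(n):
            for j in range(m):
                C[i][j] += col[i] * row[j]
    return C


def peak_gflops(num_pes, freq_mhz, flops_per_pe_per_cycle=2):
    """Peak throughput in GFLOPS.

    flops_per_pe_per_cycle=2 is an assumption (one multiply plus one
    add per PE per cycle), not a value stated in the abstract.
    """
    return num_pes * freq_mhz * 1e6 * flops_per_pe_per_cycle / 1e9


if __name__ == "__main__":
    print(matmul_rank1([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
    print(round(peak_gflops(40, 373), 2))
```

Under that assumption, 40 PEs at 373 MHz give 40 x 373e6 x 2 = 29.84e9 operations per second, which is consistent with the sustained 29.8 GFLOPS reported in the abstract.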