Title :
Reconfigurable sparse/dense matrix-vector multiplier
Author :
Kuzmanov, Georgi ; Taouil, Mottaqiallah
Author_Institution :
Comput. Eng. Lab., Delft Univ. of Technol., Delft, Netherlands
Abstract :
We propose an ANSI/IEEE-754 double-precision floating-point matrix-vector multiplier. Its main feature is the ability to efficiently process both dense matrix-vector multiplications (DMVM) and sparse matrix-vector multiplications (SMVM). The design is composed of multiple processing elements (PEs) and is optimized for FPGAs. We theoretically investigate the boundary conditions under which DMVM and SMVM performance are equal as a function of matrix sparsity. This lets us determine the most efficient processing mode for a given input data sparsity. Furthermore, we evaluate our design both in simulation and on real hardware. We experimented on an Altix 450 machine using the SGI reconfigurable application specific computing (RASC) services, which couple dual-core Itanium-2 processors with Virtex-4 LX200 FPGAs. Our design has been routed and executed on the Altix 450 machine at 100 MHz. Experimental results suggest that only two PEs suffice to outperform the pure software SMVM execution. The performance improvement at the kernel level scales nearly linearly with the number of configured PEs for both SMVM and DMVM. Compared to related work, the design shows no performance degradation and performs as well as or better than designs optimized for either SMVM or DMVM alone.
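To make the DMVM/SMVM distinction in the abstract concrete, the following is a minimal software sketch of the two kernels. It is not the authors' hardware design: the compressed sparse row (CSR) layout and the function names `dense_matvec`, `to_csr`, and `csr_matvec` are assumptions chosen for illustration, since the abstract does not state which sparse storage format the multiplier uses.

```python
from typing import List, Tuple


def dense_matvec(a: List[List[float]], x: List[float]) -> List[float]:
    """Dense product y = A x: every entry of A is multiplied, zeros included."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def to_csr(a: List[List[float]]) -> Tuple[List[float], List[int], List[int]]:
    """Convert a dense matrix to CSR arrays (assumed format, for illustration)."""
    values: List[float] = []   # nonzero entries, row by row
    col_idx: List[int] = []    # column index of each stored entry
    row_ptr: List[int] = [0]   # row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    for row in a:
        for j, v in enumerate(row):
            if v != 0.0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(values: List[float], col_idx: List[int],
               row_ptr: List[int], x: List[float]) -> List[float]:
    """Sparse product y = A x: only stored nonzeros are multiplied."""
    y: List[float] = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


# Small example: a 3x3 matrix with 4 nonzeros out of 9 entries.
a = [[4.0, 0.0, 0.0],
     [0.0, 0.0, 2.0],
     [1.0, 0.0, 3.0]]
x = [1.0, 2.0, 3.0]

values, col_idx, row_ptr = to_csr(a)
y_dense = dense_matvec(a, x)
y_sparse = csr_matvec(values, col_idx, row_ptr, x)
```

In this example the dense kernel performs 9 multiplications and the sparse kernel 4, but the sparse kernel also pays for an indirect lookup `x[col_idx[k]]` per nonzero. A trade-off of this kind is one reason a sparsity-dependent crossover between the two modes can exist; the actual boundary conditions are those derived in the paper, not this sketch.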
Keywords :
field programmable gate arrays; floating point arithmetic; reconfigurable architectures; ANSI/IEEE-754 double precision floating-point matrix-vector multiplier; Altix 450 machine; SGI reconfigurable application specific computing; Virtex-4 LX200 FPGA; dense matrix-vector multiplier; dual-core Itanium-2 processors; matrix sparsity; processing elements; reconfigurable sparse matrix-vector multiplier; Application software; Boundary conditions; Computational modeling; Computer applications; Degradation; Design optimization; Field programmable gate arrays; Hardware; Kernel; Sparse matrices;
Conference_Title :
Field-Programmable Technology, 2009. FPT 2009. International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-1-4244-4375-8
Electronic_ISBN :
978-1-4244-4377-2
DOI :
10.1109/FPT.2009.5377625