DocumentCode
3183238
Title
A Sparse Matrix Personality for the Convey HC-1
Author
Nagar, Krishna K. ; Bakos, Jason D.
Author_Institution
Dept. of Comput. Sci. & Eng., Univ. of South Carolina, Columbia, SC, USA
fYear
2011
fDate
1-3 May 2011
Firstpage
1
Lastpage
8
Abstract
In this paper we describe a double-precision floating-point sparse matrix-vector multiplier (SpMV) and its performance as implemented on a Convey HC-1 reconfigurable computer. The primary contributions of this work are a novel streaming reduction architecture for floating-point accumulation, a novel on-chip cache optimized for streaming compressed sparse row (CSR) matrices, and end-to-end integration with the HC-1's system, programming model, and runtime environment. The design is composed of 32 parallel processing elements (PEs), each connected to the HC-1's coprocessor memory and each containing a streaming multiply-accumulator and a local vector cache. When used on the HC-1, each PE has a peak throughput of 300 double-precision MFLOP/s, giving a total peak throughput of 9.6 GFLOP/s. For our test matrices, we demonstrate up to 40% of the peak performance and compare these results with results obtained using the CUSparse library on an NVIDIA Tesla S1070 GPU. In most cases our implementation exceeds the performance of the GPU.
Keywords
cache storage; coprocessors; floating point arithmetic; matrix multiplication; multiplying circuits; parallel processing; reconfigurable architectures; sparse matrices; CUSparse library; Convey HC-1 reconfigurable computer; HC-1 coprocessor memory; HC-1 system; compressed sparse row matrix; double precision floating point sparse matrix-vector multiplier; end-to-end integration; floating point accumulation; local vector cache; multiply-accumulator; on-chip cache; parallel processing; programming model; runtime environment; sparse matrix personality; streaming reduction architecture; Adders; Arrays; Coprocessors; Field programmable gate arrays; Pipelines; Sparse matrices; SpMV; floating point accumulation; reconfigurable computing; reduction; sparse matrix;
fLanguage
English
Publisher
IEEE
Conference_Titel
Field-Programmable Custom Computing Machines (FCCM), 2011 IEEE 19th Annual International Symposium on
Conference_Location
Salt Lake City, UT
Print_ISBN
978-1-61284-277-6
Electronic_ISBN
978-0-7695-4301-7
Type
conf
DOI
10.1109/FCCM.2011.60
Filename
5771239