Regional Information Center for Science and Technology - High Performance Sparse LU Solver FPGA Accelerator Using a Static Synchronous Data Flow Model

DocumentCode :

3181724

Title :

High Performance Sparse LU Solver FPGA Accelerator Using a Static Synchronous Data Flow Model

Author :

Hassan, Mohamed W. ; Helal, Ahmed E. ; Hanafy, Yasser Y.

Author_Institution :

Electr. & Comput. Eng., Virginia Tech, Blacksburg, VA, USA

fYear :

2015

fDate :

2-6 May 2015

Firstpage :

Lastpage :

Abstract :

Sparse LU solvers are common in several scientific problems. The hardware utilization of previous implementations on massively parallel platforms (including multicores, GPUs, and FPGAs) never exceeded 20%. This is due to the highly irregular computation and memory access patterns of the algorithm. Reconfigurable fabrics, with their spatial execution model, can expose the maximum inherent parallelism in the problem and achieve the highest hardware utilization. However, implementations of dynamic data flow models suffer from large overhead and scalability issues. In this paper, we propose a static synchronous data flow model that maximizes the utilization of FPGA-based architectures. The synchronous dataflow graph is mapped to a mesh of deeply pipelined processing elements (PEs) to perform the factorization. This inspires the development of a customized data structure format that reduces memory accesses, indexing overhead, and pipelining hazards. The hardware model is synthesized on a VIRTEX 7 FPGA, and the results show a hardware utilization exceeding 60%, which translates to more than 100 GFLOPS.
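
The idea of fixing the schedule before execution can be illustrated with a small sketch. This is not the paper's implementation: it assumes LU factorization without pivoting, a unit latency for every operation, and unlimited processing elements, and the function name `static_schedule` and the operation labels `div` and `mac` are hypothetical. Given only the sparsity pattern, it enumerates the elimination operations (including those on fill-in entries) and assigns each one the earliest time step at which its operands are available, so no run-time dependency checks would be needed.

```python
from collections import defaultdict


def static_schedule(n, nonzeros):
    """Enumerate right-looking LU operations for an n x n sparsity pattern
    and assign each an as-soon-as-possible (ASAP) time step.

    Assumptions (illustrative only): no pivoting, unit latency per
    operation, unlimited processing elements.
    """
    pattern = set(nonzeros)      # structural nonzeros; fill-in is added below
    ready = defaultdict(int)     # entry -> step at which its latest value exists
    ops, levels = [], []

    def emit(op, reads, write):
        # An operation can start one step after its last operand is produced.
        step = 1 + max(ready[e] for e in reads)
        ready[write] = step
        ops.append(op)
        levels.append(step)

    for k in range(n):
        rows = [i for i in range(k + 1, n) if (i, k) in pattern]
        cols = [j for j in range(k + 1, n) if (k, j) in pattern]
        # Column scaling: L[i,k] = A[i,k] / A[k,k]
        for i in rows:
            emit(("div", i, k), [(i, k), (k, k)], (i, k))
        # Submatrix update: A[i,j] -= L[i,k] * A[k,j] (may create fill-in)
        for i in rows:
            for j in cols:
                pattern.add((i, j))
                emit(("mac", i, j, k), [(i, j), (i, k), (k, j)], (i, j))
    return ops, levels


if __name__ == "__main__":
    # Two independent 2x2 blocks: their operations share time steps.
    blocks = [(0, 0), (1, 1), (2, 2), (3, 3), (1, 0), (0, 1), (3, 2), (2, 3)]
    for op, step in zip(*static_schedule(4, blocks)):
        print(step, op)
```

In the block-diagonal example the two blocks have no data dependencies on each other, so their operations land in the same time steps. A tridiagonal pattern, by contrast, yields a strictly sequential chain.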

Keywords :

data flow graphs; data structures; field programmable gate arrays; parallel algorithms; reconfigurable architectures; FPGA-based architectures; VIRTEX 7 FPGA; customized data structure format; dynamic data flow models; hardware model; high performance sparse LU solver FPGA accelerator; indexing overhead; massively parallel platforms; memory access pattern; pipelining hazards; reconfigurable fabrics; static synchronous data flow model; synchronous dataflow graph; Circuit simulation; Field programmable gate arrays; Hardware; Matrix decomposition; Multicore processing; Sparse matrices; Symmetric matrices; Deep pipeline; FPGA; Reconfigurable hardware; SPICE; Scheduling; Sparse LU; Static Synchronous Data-flow;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Field-Programmable Custom Computing Machines (FCCM), 2015 IEEE 23rd Annual International Symposium on

Conference_Location :

Vancouver, BC

Type :

conf

DOI :

10.1109/FCCM.2015.21

Filename :

7160030

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3181724