A Fine-grained Pipelined Implementation of the LINPACK Benchmark on FPGAs

Author

Wu, Guiming ; Dou, Yong ; Lei, Yuanwu ; Zhou, Jie ; Wang, Miao ; Jiang, Jingfei

Author_Institution

Nat. Lab. for Parallel & Distrib. Process., NUDT, Changsha, China

fYear

2009

fDate

5-7 April 2009

Firstpage

183

Lastpage

190

Abstract

Previous work has projected that the peak performance of FPGAs can exceed that of general-purpose processors. However, no prior work has actually compared the performance of FPGAs and CPUs using a standard benchmark such as the LINPACK benchmark. We propose and implement an FPGA-based hardware design of the LINPACK benchmark, the key step of which is LU decomposition with pivoting. We introduce a fine-grained pipelined LU decomposition algorithm that enables optimum performance by exploiting fine-grained pipeline parallelism. A scalable linear array of processing elements (PEs), which is the core component of our hardware design, is proposed to implement this algorithm. To the best of our knowledge, this is the first reported FPGA-based pipelined implementation of LU decomposition with pivoting. A total of 19 PEs can be integrated into an Altera Stratix II EP2S130F1020C5 on our self-designed development board. Experimental results show that a speedup of up to 6.14 can be achieved relative to a Pentium 4 processor for the LINPACK benchmark.
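For readers unfamiliar with the key step named in the abstract, the following is a minimal sequential sketch of LU decomposition with partial (row) pivoting, followed by the triangular solves that LINPACK uses to obtain the solution of Ax = b. It is not the paper's fine-grained pipelined FPGA algorithm or its PE array; the function names `lu_decompose` and `lu_solve` and the list-of-rows matrix layout are assumptions made purely for illustration.

```python
def lu_decompose(a):
    """Factor a square matrix (list of rows) so that P*A = L*U.

    Returns (lu, perm): lu holds U on and above the diagonal and the
    multipliers of the unit-lower-triangular L below it; perm[i] is the
    index of the original row that ended up in position i.
    Illustrative sequential version, not the pipelined hardware design.
    """
    n = len(a)
    lu = [row[:] for row in a]          # work on a copy
    perm = list(range(n))
    for k in range(n):
        # Partial pivoting: pick the row with the largest |entry| in column k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        # Eliminate column k below the pivot and update the trailing submatrix.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve A*x = b using the factors returned by lu_decompose."""
    n = len(lu)
    x = [b[perm[i]] for i in range(n)]  # apply the row permutation to b
    # Forward substitution with unit-diagonal L.
    for i in range(n):
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    # Back substitution with U.
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x
```

The innermost update over `j` is the kind of regular, data-independent work that a pipelined design can overlap across rows, while the pivot search introduces the dependency that makes pivoted LU harder to pipeline than the unpivoted variant.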

Keywords

benchmark testing; field programmable gate arrays; matrix decomposition; pipeline processing; Altera Stratix II EP2S130F1020C5; FPGA-based hardware design; LINPACK benchmark; LU decomposition; Pentium 4 processor; fine-grained pipeline parallelism; fine-grained pipelined implementation; general purpose processors; processing elements; scalable linear array; Algorithm design and analysis; Concurrent computing; Distributed computing; Distributed processing; Equations; Field programmable gate arrays; Hardware; Laboratories; Parallel processing; Pipelines;

fLanguage

English

Publisher

IEEE

Conference_Titel

Field Programmable Custom Computing Machines, 2009. FCCM '09. 17th IEEE Symposium on

Conference_Location

Napa, CA

Print_ISBN

978-0-7695-3716-0

Type

conf

DOI

10.1109/FCCM.2009.11

Filename

5290929