Regional Information Center for Science and Technology - Embedded supercomputing in FPGAs with the VectorBlox MXP Matrix Processor

DocumentCode :

2161745

Title :

Embedded supercomputing in FPGAs with the VectorBlox MXP Matrix Processor

Author :

Severance, Aaron ; Lemieux, Guy G. F.

Author_Institution :

Univ. of British Columbia, Vancouver, BC, Canada

fYear :

2013

fDate :

Sept. 29 2013-Oct. 4 2013

Firstpage :

Lastpage :

Abstract :

Embedded systems frequently use FPGAs to perform highly parallel data processing tasks. However, building such a system usually requires specialized hardware design skills with VHDL or Verilog. Instead, this paper presents the VectorBlox MXP Matrix Processor, an FPGA-based soft processor capable of highly parallel execution. Programmed entirely in C, the MXP is capable of executing data-parallel software algorithms at hardware-like speeds. For example, the MXP running at 200 MHz or higher can implement a multi-tap FIR filter and output 1 element per clock cycle. MXP's parameterized design lets the user specify the amount of parallelism required, ranging from 1 to 128 or more parallel ALUs. Key features of the MXP include a parallel-access scratchpad memory to hold vector data and high-throughput DMA and scatter/gather engines. To provide extreme performance, the processor is expandable with custom vector instructions and custom DMA filters. Finally, the MXP seamlessly ties into existing Altera and Xilinx development flows, simplifying system creation and deployment.
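
Illustrative sketch (not from the paper): the abstract's multi-tap FIR example can be pictured as a data-parallel loop in which each tap contributes one multiply-accumulate over the entire output vector. The code below is plain portable C and does not use the MXP programming interface, which this record does not describe. The names vec_mul_acc and fir_vectorized are hypothetical, and the 16-bit input with 32-bit accumulator is an assumption chosen only for illustration. On a vector processor, the inner helper is the kind of operation that could in principle be spread across parallel ALUs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: out[i] += coeff * in[i] for every i in [0, n).
 * Each iteration is independent, so the loop is data-parallel. */
static void vec_mul_acc(int32_t *out, const int16_t *in,
                        int16_t coeff, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] += (int32_t)coeff * in[i];
    }
}

/* Computes out[i] = sum over t of taps[t] * in[i + t]
 * for i in [0, n_in - n_taps + 1).
 * Returns the number of outputs written (0 if the input is too short).
 * The caller must provide room for n_in - n_taps + 1 outputs. */
size_t fir_vectorized(int32_t *out, const int16_t *in, size_t n_in,
                      const int16_t *taps, size_t n_taps)
{
    if (n_taps == 0 || n_in < n_taps) {
        return 0;
    }

    size_t n_out = n_in - n_taps + 1;

    /* Clear the accumulators. */
    for (size_t i = 0; i < n_out; i++) {
        out[i] = 0;
    }

    /* One whole-vector multiply-accumulate per tap, using the input
     * shifted by the tap index. */
    for (size_t t = 0; t < n_taps; t++) {
        vec_mul_acc(out, in + t, taps[t], n_out);
    }

    return n_out;
}
```

Calling fir_vectorized with the input {1, 2, 3, 4, 5} and taps {1, 1, 1} writes three outputs, each the sum of three consecutive input samples. Organizing the work as one vector operation per tap, rather than one dot product per output, keeps the inner loop free of cross-iteration dependencies.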

Keywords :

C language; FIR filters; embedded systems; field programmable gate arrays; formal specification; hardware description languages; instruction sets; logic design; microprocessor chips; parallel algorithms; parallel machines; Altera development flow; C programming; FPGA-based soft processor; MXP parameterized design; VHDL; VectorBlox MXP matrix processor; Verilog; Xilinx development flow; custom DMA filters; custom vector instructions; data-parallel software algorithm execution; embedded supercomputing; embedded systems; gather engine; hardware design; hardware-like speed; high-throughput DMA; highly parallel data processing task; highly parallel execution; multitap FIR filter; parallel ALU; parallel-access scratchpad memory; parallelism amount specification; scatter engine; system creation; system deployment; vector data; Clocks; Engines; Field programmable gate arrays; Finite impulse response filters; Hardware; Registers; Vectors;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Hardware/Software Codesign and System Synthesis (CODES+ISSS), 2013 International Conference on

Conference_Location :

Montreal, QC

Type :

conf

DOI :

10.1109/CODES-ISSS.2013.6658993

Filename :

6658993

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2161745