Timing aware partitioning for multi-FPGA based logic simulation using top-down selective hierarchy flattening

Author

Swaminathan, Subramanian Poothamkurissi ; Lin, Pey-Chang Kent ; Khatri, Sunil P.

Author_Institution

Dept. of ECE, Texas A&M Univ., College Station, TX, USA

fYear

2012

fDate

Sept. 30-Oct. 3, 2012

Firstpage

153

Lastpage

158

Abstract

Simulating a circuit design on FPGA hardware can substantially accelerate logic simulation. This is often referred to as emulation, and we use the terms simulation and emulation interchangeably in this paper. However, the limited resources of a single FPGA prevent large designs from being implemented on one device, so the design must be partitioned and simulated on a multi-FPGA platform. Existing FPGA-based post-synthesis partitioning approaches first flatten the circuit completely and then possibly perform bottom-up clustering. In contrast, we perform selective top-down flattening and thereby avoid the potential netlist blowup. This also preserves the design hierarchy, which guides the partitioning and makes subsequent debugging easier. Our approach analyzes the hierarchical design and selectively flattens instances using two slack-based metrics. The resulting partially flattened netlist is converted to a hypergraph, partitioned using hMetis, and converted back into a set of netlists, one for each FPGA of the accelerated logic simulation platform. We compare our approach with a partitioning approach that operates on a completely flattened netlist, using static timing analysis for both. Across 15 large examples from the OpenCores project, our approach yields a 52% logic simulation speedup and about 0.74× the runtime for the entire flow, compared to the completely flat approach. The entire tool chain is automated in an end-to-end flow covering hierarchy extraction, selective flattening, partitioning, and netlist reconstruction. Compared to an existing method that also performs slack-based partitioning of a hierarchical netlist, we obtain a 35% simulation speedup. Our method scales well: the simulation speedup and runtime improvement are significantly better for larger examples.
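The flow described in the abstract (selective top-down flattening, then conversion to a hypergraph for hMetis) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define the two slack metrics, so the rule used here (flatten an instance when its worst slack falls below a threshold, otherwise keep it as one vertex) is an assumption, as are the names `Instance`, `selective_flatten`, `owner`, and `to_hmetis`. The output follows the hMetis hypergraph text format with vertex weights (format flag `10`).

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    cells: int = 1              # leaf cell count, used as vertex weight
    worst_slack: float = 0.0    # hypothetical per-instance slack metric
    children: list = field(default_factory=list)


def total_cells(inst):
    """Number of leaf cells contained in an instance."""
    if not inst.children:
        return inst.cells
    return sum(total_cells(c) for c in inst.children)


def selective_flatten(inst, slack_threshold, prefix=""):
    """Top-down traversal returning {vertex path: weight}.

    Assumed rule: an instance with children is flattened (replaced by its
    children) only when its worst slack is below the threshold.
    """
    path = prefix + inst.name
    if not inst.children or inst.worst_slack >= slack_threshold:
        return {path: total_cells(inst)}
    kept = {}
    for child in inst.children:
        kept.update(selective_flatten(child, slack_threshold, path + "/"))
    return kept


def owner(leaf_path, kept):
    """Find the kept vertex that contains a given leaf cell path."""
    parts = leaf_path.split("/")
    for i in range(len(parts), 0, -1):
        candidate = "/".join(parts[:i])
        if candidate in kept:
            return candidate
    raise KeyError(leaf_path)


def to_hmetis(kept, nets):
    """Emit hMetis hypergraph text: header, hyperedges, vertex weights."""
    names = sorted(kept)
    index = {n: i + 1 for i, n in enumerate(names)}  # hMetis is 1-indexed
    edges = []
    for net in nets:
        verts = sorted({index[owner(p, kept)] for p in net})
        if len(verts) > 1:      # nets internal to one vertex are dropped
            edges.append(verts)
    lines = [f"{len(edges)} {len(names)} 10"]
    lines += [" ".join(map(str, e)) for e in edges]
    lines += [str(kept[n]) for n in names]
    return "\n".join(lines) + "\n"


# Toy design: the ALU is timing-critical, the controller is relaxed.
top = Instance("top", worst_slack=0.2, children=[
    Instance("alu", worst_slack=0.2, children=[
        Instance("add", cells=40), Instance("mul", cells=60)]),
    Instance("ctrl", worst_slack=3.0, children=[
        Instance("fsm", cells=10), Instance("dec", cells=20)]),
])
nets = [["top/alu/add", "top/alu/mul"],
        ["top/alu/mul", "top/ctrl/fsm"],
        ["top/ctrl/fsm", "top/ctrl/dec"]]

kept = selective_flatten(top, slack_threshold=1.0)
hgr = to_hmetis(kept, nets)
```

In this toy design the ALU is flattened into its two leaf cells while the controller stays a single vertex of weight 30, so the hypergraph has three vertices instead of four. The resulting text could be written to a file and handed to hMetis; mapping the partition result back to per-FPGA netlists is outside the scope of this sketch.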

Keywords

field programmable gate arrays; logic simulation; FPGA-based accelerated logic simulation platform; FPGA-based post-synthesis partitioning; FPGA hardware; FPGA netlist; OpenCores project; bottom-up clustering; entire tool chain; hierarchical netlist; hierarchy extraction; hypergraph; logic simulation speedup; multi-FPGA based logic simulation; multi-FPGA platform; netlist reconstruction; runtime improvement; selective flattening; slack-based partitioning; static timing analysis; timing-aware partitioning; top-down selective hierarchy flattening; Algorithm design and analysis; Delay; Field programmable gate arrays; Mathematical model; Partitioning algorithms; Runtime

fLanguage

English

Publisher

IEEE

Conference_Title

Computer Design (ICCD), 2012 IEEE 30th International Conference on

Conference_Location

Montreal, QC

ISSN

1063-6404

Print_ISBN

978-1-4673-3051-0

Type

conf

DOI

10.1109/ICCD.2012.6378634

Filename

6378634