Title :
Adaptive fault recovery for networked reconfigurable systems
Author :
Xu, Weifeng ; Ramanarayanan, Ramshankar ; Tessier, Russell
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Massachusetts, Amherst, MA, USA
Abstract :
The device-level size and complexity of reconfigurable architectures make fault tolerance an important concern in system design. In this paper, we introduce a fully automated fault recovery system for networked systems that contain FPGAs (field programmable gate arrays). If a fault is detected that cannot be addressed locally, fault information is transferred to a reconfiguration server. Following design recompilation to avoid the fault, a new FPGA configuration is returned to the remote system and computation is reinitiated. To illustrate the benefit of this approach, we have implemented a complete fault recovery system that requires no manual intervention. An important part of the system is a timing-driven incremental router for Xilinx Virtex devices. This router is directly interfaced to Xilinx JBits and uses no CAD tools from the standard Xilinx Alliance tool flow. Our completed system has been applied to three benchmark designs and exhibits complete fault recovery in up to 12x less time than the standard incremental Xilinx PAR flow.
Keywords :
adaptive systems; computer networks; fault tolerant computing; field programmable gate arrays; reconfigurable architectures; FPGA; Xilinx Alliance; Xilinx JBits; Xilinx Virtex device; adaptive fault recovery; fault tolerance; field programmable gate array; networked systems; reconfigurable architecture; timing-driven incremental router; Computer networks; Design automation; Fault detection; Fault tolerant systems; Field programmable gate arrays; Network servers; Reconfigurable logic; Redundancy; Routing; Table lookup;
Conference_Title :
Field-Programmable Custom Computing Machines, 2003. FCCM 2003. 11th Annual IEEE Symposium on
Print_ISBN :
0-7695-1979-2
DOI :
10.1109/FPGA.2003.1227250