Title :
REPAIR: A Reliable Partial-Redundancy-Based Router in NoC
Author :
Lei Xie ; Kuizhi Mei ; Yuhai Li
Author_Institution :
Inst. of Artificial Intell. & Robot., Xi'an Jiaotong Univ., Xi'an, China
Abstract :
As the scale and integration density of networks-on-chip increase sharply, more transistors are integrated onto a single chip. This unfortunately leads to more unexpected variations and faults in the system. In particular, transient errors and permanent hardware faults have rapidly become the key constraint on large-scale network design. This trend calls for incorporating fault-tolerant solutions into Network-on-Chip (NoC) architectures. In this paper we propose a Reliable Partial-Redundancy-based router architecture (REPAIR). The proposed scheme uses only an additional buffer and a bus to enhance the connectivity of the data path in the router. REPAIR also employs error control coding (ECC) modules and decision-table-based (DT) control logic to implement efficient online diagnosis and a reconfiguration mechanism, respectively. The experimental results show that REPAIR tolerates hard faults well under high fault rates. Specifically, the silicon protection factor (SPF) of an individual router reaches 16.34, and over 95% of packets can still be successfully transferred in a 16x16 torus network with 650 faults.
Keywords :
decision tables; error correction codes; network routing; network-on-chip; protection; silicon; NoC; REPAIR; buffer; bus; data path; decision-table-based control logic; error control coding modules; fault rates; network-on-chip architecture; online diagnosis; reconfigurable mechanism; reliable partial-redundancy-based router; silicon protection factor; Error correction codes; Fault tolerance; Fault tolerant systems; Maintenance engineering; Ports (Computers); Routing;
Conference_Title :
Networking, Architecture and Storage (NAS), 2013 IEEE Eighth International Conference on
Conference_Location :
Xi'an
DOI :
10.1109/NAS.2013.28