Title :
Hardware-efficient and highly-reconfigurable 4- and 2-track fault-tolerant designs for mesh-connected multicomputers
Author :
Mahapatra, Nihar R. ; Dutt, Shantanu
Author_Institution :
Dept. of Electr. Eng., Minnesota Univ., Minneapolis, MN, USA
Abstract :
We consider m-track models for constructing fault-tolerant (FT) mesh systems, which have one primary and m spare tracks per row and column, switches at the intersections of these tracks, and spare processors at the boundaries. A faulty system is reconfigured by finding, for each fault u, a reconfiguration path from the fault to a spare in which, starting from the fault u, a processor is replaced or “covered” by the nearest “available” succeeding processor on the path; a processor on the path is not available if it is faulty or is used as a “cover” on some other reconfiguration path. In previous work, a 1-track design that can support any set of node-disjoint straight reconfiguration paths, and a more reliable 3-track design that can support any set of node-disjoint rectilinear reconfiguration paths, have been proposed. In this paper, we present: (1) A fundamental result regarding the universality of simple “one-to-one switches” in m-track 2-D mesh designs in terms of their reconfigurabilities. (2) A 4-track mesh design that can support any set of edge-disjoint (a much less restrictive criterion than node-disjointness) rectilinear reconfiguration paths, and that has 34% less switching overhead and significantly higher, actually close-to-optimal, reconfigurability compared to the previously proposed 3-track design. (3) A new 2-track design, derived from the above 4-track design, that we show can support the same set of reconfiguration paths as the previous 3-track design but with 33% less wiring overhead. (4) Results on the deterministic fault tolerance capabilities (the number of faults guaranteed reconfigurable) of our 4- and 2-track designs, and the previously proposed 1- and 3-track designs.
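The covering rule described in the abstract can be illustrated with a minimal sketch. It is one reading of the abstract only, not the authors' algorithm: the function name `cover_chain`, the representation of a path as an ordered list ending in a spare, and the two sets `faulty` and `used_as_cover` are all hypothetical, and the tracks and switches are not modelled.

```python
def cover_chain(path, faulty, used_as_cover):
    """Return (covered, cover) pairs along one reconfiguration path.

    path          -- processors ordered from the fault (first) to the spare (last)
    faulty        -- set of faulty processors (assumed representation)
    used_as_cover -- processors already acting as covers on other paths
    """
    def available(p):
        # A processor is unavailable if it is faulty or already a cover elsewhere.
        return p not in faulty and p not in used_as_cover

    pairs = []
    current = 0  # index of the processor that currently needs a cover
    while current < len(path) - 1:
        # Nearest available succeeding processor on the path.
        nxt = next(
            (i for i in range(current + 1, len(path)) if available(path[i])),
            None,
        )
        if nxt is None:
            raise ValueError("no available processor left on the path")
        pairs.append((path[current], path[nxt]))
        current = nxt  # the cover now needs a cover itself, until the spare is reached
    return pairs


# Fault "u"; "b" is skipped because it is a cover on another path; "s" is the spare.
pairs = cover_chain(["u", "a", "b", "c", "s"], faulty={"u"}, used_as_cover={"b"})
```

In this example "a" covers "u", "c" covers "a", and the spare "s" covers "c", so the chain ends at the spare.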
Keywords :
computer network reliability; fault tolerant computing; multiprocessor interconnection networks; parallel architectures; reconfigurable architectures; reliability; deterministic fault tolerance capabilities; mesh-connected multicomputers; fault-tolerant designs; node-disjoint rectilinear reconfiguration paths; node-disjoint straight reconfiguration paths; reconfigurability; spare processors; switching overhead; Design methodology; Fault tolerance; Fault tolerant systems; Head; Parallel machines; Process design; Semiconductor device modeling; Switches; Tail; Wiring;
Conference_Title :
Fault Tolerant Computing, 1996., Proceedings of Annual Symposium on
Conference_Location :
Sendai
Print_ISBN :
0-8186-7262-5
DOI :
10.1109/FTCS.1996.535880