DocumentCode :
1554960
Title :
Systematic design of fault-tolerant multiprocessors with shared buses
Author :
Ku, Hung-Kuei ; Hayes, John P.
Author_Institution :
AT&T Bell Labs., Middletown, NJ, USA
Volume :
46
Issue :
4
fYear :
1997
fDate :
4/1/1997
Firstpage :
439
Lastpage :
455
Abstract :
A multiprocessor system is fault-tolerant (FT) if it preserves a fault-free subsystem of a predetermined interconnection structure when faults appear. We present a new method for designing FT multiprocessors that can efficiently tolerate both processor and interconnection faults. The approach is general, in that it can be applied to any multiprocessor topology. Shared buses serve as the main interconnection mechanism to minimize the switching logic needed for reconfiguration. We employ processor-bus-link (PBL) graphs to model multiprocessors with either dedicated or shared buses. Both processors and buses are represented as nodes so that bus faults can be considered explicitly and tolerated efficiently by spare buses instead of by spare processors. A minimum number of spare processors and buses is used to reduce hardware overhead. The node covering concept and the maximum-weight spanning tree algorithm are then employed to construct FT systems that have lower interconnection cost than most previous designs. We also present a cost-effective implementation method which is suitable for both static and dynamic reconfiguration techniques. The FT systems obtained have the advantages of no critical single point of failure, low redundancy, local replacement, and simple circuitry for fast reconfiguration.
Keywords :
fault tolerant computing; multiprocessing systems; bus faults; fast reconfiguration; fault-free subsystem; fault-tolerant multiprocessors; interconnection faults; local replacement; low redundancy; maximum-weight spanning tree algorithm; processor-bus-link graphs; shared buses; switching logic; systematic design; Algorithm design and analysis; Circuit faults; Costs; Design methodology; Fault tolerant systems; Hardware; Integrated circuit interconnections; Multiprocessing systems; Reconfigurable logic; Topology;
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/12.588058
Filename :
588058