Title :
Implementation of online distributed system-level diagnosis theory
Author :
Bianchini, Ronald P., Jr. ; Buskens, Richard W.
Author_Institution :
Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA
fDate :
5/1/1992
Abstract :
The practical application and implementation of online distributed system-level diagnosis theory are documented. Proven distributed diagnosis algorithms are shown to be impractical in real systems because of their high resource requirements. A distributed system-level diagnosis algorithm called Adaptive DSD is shown to minimize network resource usage and has resulted in a practical implementation. Adaptive DSD assumes a distributed network in which nodes can test other nodes and determine them to be faulty or fault-free. Each node issues its tests adaptively, depending on the fault situation of the network. Test result reports are generated from the test results and forwarded between nodes in the network. Adaptive DSD is proven correct in that each fault-free node reaches an accurate, independent diagnosis of the fault conditions of the remaining nodes. No restriction is placed on the number of faulty nodes; any fault situation with any number of faulty nodes is diagnosed correctly. An implementation of the Adaptive DSD algorithm is described.
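The adaptive testing and diagnosis steps summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a ring ordering of nodes, perfectly reliable tests by fault-free nodes, and a single shared array standing in for the test result reports that real nodes would forward to one another. The names `run_testing_round`, `diagnose`, and `tested_up` are hypothetical and chosen for illustration only.

```python
from typing import List, Optional, Set


def run_testing_round(n: int, faulty: Set[int]) -> List[Optional[int]]:
    """Simulate one testing round (assumed ring ordering of n nodes).

    Each fault-free node tests its successors in order until it finds a
    fault-free one, and records that node. Faulty nodes record nothing,
    since their results could not be trusted.
    """
    tested_up: List[Optional[int]] = [None] * n
    for i in range(n):
        if i in faulty:
            continue
        for step in range(1, n):
            j = (i + step) % n
            if j not in faulty:  # assumed: a test by a fault-free node is accurate
                tested_up[i] = j
                break
    return tested_up


def diagnose(start: int, tested_up: List[Optional[int]]) -> Set[int]:
    """Diagnose from the viewpoint of fault-free node `start`.

    Follow the recorded results from `start`; every node reached is
    fault-free, and every node never reached is diagnosed as faulty.
    """
    fault_free = {start}
    node = tested_up[start]
    while node is not None and node not in fault_free:
        fault_free.add(node)
        node = tested_up[node]
    return set(range(len(tested_up))) - fault_free


if __name__ == "__main__":
    results = run_testing_round(8, faulty={1, 4, 5})
    print(diagnose(0, results))
```

Under these assumptions each fault-free node performs only as many tests as it needs to find one fault-free successor, which is one way to read the abstract's claim that tests depend on the fault situation, and the diagnosis stays correct for any number of faulty nodes.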
Keywords :
distributed processing; fault tolerant computing; multiprocessor interconnection networks; parallel algorithms; Adaptive DSD; distributed diagnosis algorithms; distributed network; fault conditions; fault situation; fault-free node; faulty; minimize; network nodes; network resources; online distributed system-level diagnosis theory; real systems; Adaptive systems; Application software; Computer displays; Computer networks; Fault diagnosis; Performance evaluation; Testing; Workstations
Journal_Title :
Computers, IEEE Transactions on