DocumentCode
1244039
Title
A distributed system-level diagnosis algorithm for arbitrary network topologies
Author
Rangarajan, Sampath ; Dahbura, Anton T. ; Ziegler, Eric A.
Author_Institution
Dept. of Electr. & Comput. Eng., Northeastern Univ., Boston, MA, USA
Volume
44
Issue
2
fYear
1995
fDate
2/1/1995
Firstpage
312
Lastpage
334
Abstract
A distributed algorithm is described for detecting and diagnosing faulty processors in an arbitrary network. Fault free processors perform simple periodic tests on one another; when a fault is detected or a newly repaired processor joins the network, this new information is disseminated in parallel throughout the network. It is formally proven that the algorithm is correct, and it is also shown that the algorithm is optimal in terms of the time required for all of the fault free processors in the network to learn of a new event. Simulation results are given for arbitrary network topologies.
Keywords
computer debugging; distributed algorithms; fault tolerant computing; program verification; reliability; algorithm correctness; arbitrary network topologies; distributed system-level diagnosis algorithm; fault free processors; faulty processors; periodic tests; Computer networks; Distributed algorithms; Distributed computing; Fault detection; Fault diagnosis; Military computing; Network topology; Performance evaluation; System testing; Workstations
fLanguage
English
Journal_Title
Computers, IEEE Transactions on
Publisher
IEEE
ISSN
0018-9340
Type
jour
DOI
10.1109/12.364542
Filename
364542