Title :
Identifying the cause of detected errors
Author_Institution :
Allied-Signal Aerosp. Co., Columbia, MD, USA
Abstract :
The author presents an approach to the consistent diagnosis of error monitoring observations in a distributed fault-tolerant computing system, even when the faulty source produces arbitrary errors. He describes the online algorithm used in the multicomputer architecture for fault tolerance (MAFT) to diagnose faulty system elements. Using syndrome information that categorizes detected errors as either symmetric or asymmetric, bounds for correct diagnosis can be deduced. Finally, an interactive consistency algorithm is employed to guarantee consistent diagnosis in a distributed environment and to provide online verification of all diagnostic units.
Keywords :
computer architecture; distributed processing; fault tolerant computing; arbitrary errors; consistent diagnosis; diagnostic units; distributed fault-tolerant computing system; error monitoring observations; faulty source; interactive consistency algorithm; multicomputer architecture for fault tolerance; online algorithm; online verification; Aerodynamics; Distributed computing; Fault detection; Fault diagnosis; Fault tolerant systems; Hardware; Monitoring; Redundancy; Testing; Working environment noise;
Conference_Title :
Fault-Tolerant Computing, 1990. FTCS-20. Digest of Papers., 20th International Symposium
Conference_Location :
Newcastle Upon Tyne, UK
Print_ISBN :
0-8186-2051-X
DOI :
10.1109/FTCS.1990.89365