Title :
A distributed algorithm for fault diagnosis in systems with soft failures
Author :
Yang, Che-Liang ; Masson, Gerald M.
Author_Institution :
GTE Labs. Inc., Waltham, MA, USA
fDate :
11/1/1988
Abstract :
The problem of diagnosis of soft failures at the system level in large and fully distributed networks of processors (or units) is considered. A system model in which each of the network's units is assumed to possess the ability to test (or evaluate) certain other units for the presence of failures is employed. Using this model and assuming that the total number of faulty units does not exceed a given bound, a distributed algorithm is presented which allows all the fault-free units to independently converge to correct and consistent diagnoses of the system status. This algorithm is also shown to be applicable to bounded fault situations where both units and communication links can be faulty.
Keywords :
distributed processing; failure analysis; fault location; fault tolerant computing; bounded fault situations; consistent diagnoses; distributed algorithm; fault diagnosis; fault-free units; faulty units; fully distributed networks; soft failures; system model; system status; Computer errors; Digital arithmetic; Digital filters; Digital signal processing; Distributed algorithms; Electrons; Fault diagnosis; Redundancy; Testing; Very large scale integration
Journal_Title :
Computers, IEEE Transactions on