DocumentCode :
1159935
Title :
Design of algorithm-based fault-tolerant multiprocessor systems for concurrent error detection and fault diagnosis
Author :
Vinnakota, Bapiraju ; Jha, Niraj K.
Author_Institution :
Dept. of Electr. Eng., Minnesota Univ., Minneapolis, MN, USA
Volume :
5
Issue :
10
fYear :
1994
fDate :
10/1/1994 12:00:00 AM
Firstpage :
1099
Lastpage :
1106
Abstract :
Algorithm-based fault tolerance (ABFT) is a low-overhead system-level concurrent error detection and fault location scheme for multiprocessor systems. We present new methods for the design of ABFT systems. Our design procedure is applicable to a wide range of systems in which processors share data elements. A feature of our design approach is that the type of checks to be used in the final system can be controlled by the system designer. We also present some new bounds on the number of checks needed in ABFT system design.
Keywords :
fault location; fault tolerant computing; multiprocessing systems; parallel architectures; reliability; system recovery; ABFT system design; ABFT systems; algorithm-based fault tolerance; algorithm-based multiprocessor systems; concurrent error detection; data element sharing; design procedure; fault diagnosis; fault location scheme; fault-tolerant multiprocessor systems; low-overhead system-level error detection; Algorithm design and analysis; Control systems; Design methodology; Fault detection; Fault diagnosis; Fault location; Fault tolerance; Fault tolerant systems; Multiprocessing systems; Signal processing algorithms;
fLanguage :
English
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
Publisher :
IEEE
ISSN :
1045-9219
Type :
jour
DOI :
10.1109/71.313125
Filename :
313125