DocumentCode
2199009
Title
Error detection mechanisms for massively parallel multiprocessors
Author
Dal Cin, M. ; Hohl, W. ; Michel, E. ; Pataricza, A.
Author_Institution
Math. Inst., Erlangen-Nurnberg Univ., Germany
fYear
1993
fDate
27-29 Jan 1993
Firstpage
401
Lastpage
408
Abstract
A survey of the most important methods for error detection in multiprocessor systems is presented. A detailed comparison between watchdog-processor-based and master-checker-based fault tolerance is given. The fault coverage and the hardware and run-time overhead are discussed, based on experience gained in the development of the MEMSY fault-tolerant multiprocessor system. The cumulative effects resulting from the simultaneous use of different hardware-level and high-level fault-tolerance mechanisms are shown.
Keywords
error detection; fault tolerant computing; parallel machines; MEMSY fault-tolerant multiprocessor system; error detection mechanisms; fault coverage; hardware; massively parallel multiprocessors; master-checker based fault tolerance; run-time overhead; watchdog processor based fault tolerance; Application software; Computer architecture; Concurrent computing; Delay; Fault detection; Fault tolerance; Hardware; Multiprocessing systems; Redundancy; Testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Processing, 1993. Proceedings. Euromicro Workshop on
Conference_Location
Gran Canaria
Print_ISBN
0-8186-3610-6
Type
conf
DOI
10.1109/EMPDP.1993.336378
Filename
336378