Title :
Recovery in Multicomputers with Finite Error Detection Latency
Author :
Krishna, P. ; Vaidya, N.H. ; Pradhan, D.K.
Abstract :
In most research on checkpointing and recovery, it has been assumed that the processor halts immediately in response to any internal failure (the fail-stop model). This paper presents a recovery scheme (independent checkpointing and message logging) for a multicomputer system consisting of processors with a non-zero error detection latency. Our scheme tolerates bounded error detection latencies, thereby achieving higher fault coverage. The simulation results show that, for typical detection latency values, the recovery overhead is almost independent of the detection latency.
Conference_Title :
Parallel Processing, 1994. ICPP 1994 Volume 2. International Conference on
Conference_Location :
North Carolina, USA
Print_ISBN :
0-8493-2493-9
DOI :
10.1109/ICPP.1994.174