DocumentCode :
327894
Title :
Distributed checkpoint algorithms to avoid roll-back propagation
Author :
Zambonelli, Franco
Author_Institution :
Dipt. di Sci. dell'Ingegneria, Modena Univ., Italy
Volume :
1
fYear :
1998
fDate :
25-27 Aug 1998
Firstpage :
403
Abstract :
Checkpointing is a well-known mechanism for achieving fault tolerance. In distributed applications, a local checkpoint is useful for fault tolerance only if it can belong to at least one consistent global checkpoint, so that execution can be restarted from it without rolling back further into the past. The paper introduces a theoretical framework that facilitates the definition and analysis of distributed checkpoint algorithms that avoid roll-back propagation. On this basis, several algorithms are presented and evaluated on a set of testbed applications.
Keywords :
distributed algorithms; message passing; software fault tolerance; checkpointing; consistent global checkpoint; distributed applications; distributed checkpoint algorithms; fault tolerance; local checkpoint; roll back propagation; testbed applications; theoretical framework; Algorithm design and analysis; Checkpointing; Computational modeling; Distributed computing; Fault tolerance; Force control; Process control; Testing
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Euromicro Conference, 1998. Proceedings. 24th
Conference_Location :
Vasteras
ISSN :
1089-6503
Print_ISBN :
0-8186-8646-4
Type :
conf
DOI :
10.1109/EURMIC.1998.711833
Filename :
711833