DocumentCode :
2379348
Title :
On staggered checkpointing
Author :
Vaidya, Nitin H.
Author_Institution :
Dept. of Comput. Sci., Texas A&M Univ., College Station, TX, USA
fYear :
1996
fDate :
23-26 Oct 1996
Firstpage :
572
Lastpage :
580
Abstract :
A consistent checkpointing algorithm saves a consistent view of a distributed application's state on stable storage. The traditional consistent checkpointing algorithms require different processes to save their state at about the same time. This causes contention for the stable storage, potentially resulting in large overheads. Staggering the checkpoints taken by various processes can reduce checkpoint overhead. The paper presents a simple approach to arbitrarily stagger the checkpoints. The approach requires that the processes take consistent logical checkpoints, as compared to consistent physical checkpoints enforced by existing algorithms. Experimental results on nCube-2 are presented.
Keywords :
distributed algorithms; distributed memory systems; fault tolerant computing; hypercube networks; reliability; system recovery; checkpoint overhead reduction; consistent checkpointing algorithm; consistent logical checkpoints; distributed application state; nCube-2; stable storage; staggered checkpointing; Checkpointing; Communication system control; Computer science; Degradation; Delay; Frequency; Upper bound;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Processing, 1996., Eighth IEEE Symposium on
Conference_Location :
New Orleans, LA
Print_ISBN :
0-8186-7683-3
Type :
conf
DOI :
10.1109/SPDP.1996.570386
Filename :
570386