DocumentCode :
2601046
Title :
Lazy checkpoint coordination for bounding rollback propagation
Author :
Wang, Yi-Min ; Fuchs, W. Kent
Author_Institution :
Univ. of Illinois at Urbana-Champaign, IL, USA
fYear :
1993
fDate :
6-8 Oct 1993
Firstpage :
78
Lastpage :
85
Abstract :
The technique of lazy checkpoint coordination, which preserves process autonomy while employing communication-induced checkpoint coordination for bounding rollback propagation, is proposed. The notion of laziness is introduced to control the coordination frequency and allow a flexible tradeoff between the cost of checkpoint coordination and the average rollback distance. Worst-case overhead analysis provides a means for estimating the extra checkpoint overhead. Communication trace-driven simulation for several parallel programs is used to evaluate the benefits of the proposed scheme.
Keywords :
fault tolerant computing; parallel programming; system monitoring; system recovery; average rollback distance; checkpoint overhead; communication-induced checkpoint coordination; coordination frequency; lazy checkpoint coordination; parallel programs; process autonomy; rollback propagation; Checkpointing; Contracts; Costs; Frequency measurement; History; Laboratories; Message passing; NASA; Performance evaluation; Runtime;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Reliable Distributed Systems, 1993. Proceedings., 12th Symposium on
Conference_Location :
Princeton, NJ
Print_ISBN :
0-8186-4310-2
Type :
conf
DOI :
10.1109/RELDIS.1993.393471
Filename :
393471