Title :
An analysis of communication induced checkpointing
Author :
Alvisi, L. ; Elnozahy, E. ; Rao, S. ; Husain, S.A. ; de Mel, A.
Author_Institution :
Dept. of Comput. Sci., Texas Univ., Austin, TX, USA
Abstract :
Communication induced checkpointing (CIC) allows processes in a distributed computation to take independent checkpoints and to avoid the domino effect. This paper presents an analysis of CIC protocols based on a prototype implementation and validated simulations. Our results indicate that there is sufficient evidence to suspect that much of the conventional wisdom about these protocols is questionable.
Keywords :
distributed programming; protocols; system recovery; CIC; CIC protocols; distributed computation; independent checkpoints; Analytical models; Checkpointing; Computational modeling; Electrical capacitance tomography; Protocols; Prototypes; Scalability; Virtual prototyping
Conference_Title :
Fault-Tolerant Computing, 1999. Digest of Papers. Twenty-Ninth Annual International Symposium on
Conference_Location :
Madison, WI, USA
Print_ISBN :
0-7695-0213-X
DOI :
10.1109/FTCS.1999.781058