Title :
How safe is probabilistic checkpointing?
Author_Institution :
Res. Lab., IBM Corp., Austin, TX, USA
Abstract :
Probabilistic checkpointing was recently introduced as a technique for implementing incremental checkpointing. The technique is claimed to improve on traditional ones because it is portable, is more efficient, and has lower storage requirements. On the downside, it may produce erroneous checkpoints because of a problem called aliasing. A previous probabilistic analysis concluded that the likelihood of aliasing in practice is negligible. This paper presents an empirical study showing that aliasing occurs more frequently than that analysis estimated. The results refute the previous claims about probabilistic checkpointing and establish that it is unsafe in practice.
Keywords :
probability; software fault tolerance; system recovery; aliasing; erroneous checkpoints; incremental checkpointing; probabilistic checkpointing; storage requirements; Checkpointing; Emulation; Fault tolerance; Hardware; Kernel; Memory management; Portable computers; Protection; Resumes; Runtime
Conference_Title :
Fault-Tolerant Computing, 1998. Digest of Papers. Twenty-Eighth Annual International Symposium on
Conference_Location :
Munich, Germany
Print_ISBN :
0-8186-8470-4
DOI :
10.1109/FTCS.1998.689486