Title :
Consistent global checkpoints that contain a given set of local checkpoints
Author_Institution :
AT&T Bell Labs., Murray Hill, NJ, USA
Date :
4/1/1997
Abstract :
In this paper, we consider the problem of constructing consistent global checkpoints that contain a given set of checkpoints. We address three important issues related to this problem. First, we define the maximum and minimum consistent global checkpoints containing a set S, and give algorithms to construct them. These algorithms are based on reachability analysis on a rollback-dependency graph. Second, we introduce a concept called “rollback-dependency tractability” that enables this analysis to be performed efficiently for a certain class of checkpoint and communication models. We define the least stringent of these models (“FDAS”), and put it in context with other models defined in the literature. Of particular significance is a way to use FDAS to provide efficient rollback recovery for applications that do not satisfy perfect piecewise determinism. Finally, we describe several applications of the theorems and algorithms derived in this paper to demonstrate the capability of our approach to unify, generalize, and extend many previous works.
Keywords :
concurrency control; fault tolerant computing; program debugging; reachability analysis; FDAS; consistent global checkpoints; local checkpoints; rollback-dependency graph; rollback-dependency tractability; Application software; Checkpointing; Context modeling; Debugging; Fault tolerant systems; Hardware; Nonvolatile memory; Performance analysis; Reachability analysis; System recovery
Journal_Title :
Computers, IEEE Transactions on