Title :
Maximum and minimum consistent global checkpoints and their applications
Author_Institution :
AT&T Bell Labs., Murray Hill, NJ, USA
Abstract :
This paper considers the problem of constructing the maximum and the minimum consistent global checkpoints that contain a target set of checkpoints, and identifies it as a generic issue in recovery-related applications. We formulate the problem as a reachability analysis problem on a directed rollback-dependency graph, and develop efficient algorithms to calculate the two consistent global checkpoints for both general nondeterministic executions and piecewise deterministic executions. We also demonstrate that the approach provides a generalization of, and a unifying framework for, many existing and potential applications, including software error recovery, mobile computing recovery, parallel debugging, and output commits.
Keywords :
program debugging; reachability analysis; software fault tolerance; consistent global checkpoints; directed rollback-dependency graph; general nondeterministic executions; generic issue; mobile computing recovery; output commits; parallel debugging; piecewise deterministic executions; reachability analysis problem; recovery-related applications; software error recovery; unifying framework; Application software; Checkpointing; Concurrent computing; Hardware; Mobile computing; Nonvolatile memory; Power system modeling; Protocols; Reachability analysis; Software debugging;
Conference_Title :
Reliable Distributed Systems, 1995. Proceedings., 14th Symposium on
Conference_Location :
Bad Neuenahr
Print_ISBN :
0-8186-7153-X
DOI :
10.1109/RELDIS.1995.526216