Title :
Efficient transparent optimistic rollback recovery for distributed application programs
Author :
Johnson, David B.
Author_Institution :
Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
A transparent rollback-recovery method that adds very little overhead to distributed application programs and efficiently supports the quick commit of all output to the outside world is introduced. Each process can independently choose at any time either to use checkpointing alone (as in consistent checkpointing) or to use optimistic message logging. The system is based on a new commit algorithm that requires communication with and information about the minimum number of other processes in the system, and supports the recovery of both deterministic and nondeterministic processes.
Keywords :
distributed processing; fault tolerant computing; software fault tolerance; system recovery; checkpointing; commit algorithm; distributed application programs; nondeterministic processes; optimistic message logging; transparent rollback-recovery method; Application software; Computer science; Concurrent computing; Contracts; Fault tolerance; Information science; Optimization methods; Programming profession; Propagation delay
Conference_Title :
Proceedings of the 12th Symposium on Reliable Distributed Systems, 1993
Conference_Location :
Princeton, NJ
Print_ISBN :
0-8186-4310-2
DOI :
10.1109/RELDIS.1993.393470