DocumentCode :
2302704
Title :
A low overhead checkpointing and rollback recovery scheme for distributed systems
Author :
Tong, Zhijun ; Kain, Richard Y. ; Tsai, W.T.
Author_Institution :
Dept. of Electr. Eng., Minnesota Univ., Minneapolis, MN, USA
fYear :
1989
fDate :
10-12 Oct 1989
Firstpage :
12
Lastpage :
20
Abstract :
A major obstacle in implementing a rollback recovery scheme for fault tolerance in a concurrent distributed system is the domino effect. A low overhead checkpointing scheme is proposed to prevent this effect. Each process saves its state periodically. The state-save synchronization among processes is implemented by bounding clock drifts. A communication protocol that assures that all saved states are consistent is developed.
Keywords :
distributed processing; fault tolerant computing; network operating systems; protocols; system recovery; bounding clock drifts; communication protocol; concurrent distributed system; distributed systems; domino effect; fault tolerance; low overhead checkpointing; rollback recovery scheme; saved states; Checkpointing; Clocks; Computer science; Distributed computing; Fault detection; Fault tolerant systems; Power system reliability; Protocols; Radio access networks; Synchronization
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Reliable Distributed Systems, 1989., Proceedings of the Eighth Symposium on
Conference_Location :
Seattle, WA
Print_ISBN :
0-8186-1981-3
Type :
conf
DOI :
10.1109/RELDIS.1989.72744
Filename :
72744