DocumentCode :
935079
Title :
Use of common time base for checkpointing and rollback recovery in a distributed system
Author :
Ramanathan, Parameswaran ; Shin, Kang G.
Author_Institution :
Dept. of Electr. & Comput. Eng., Wisconsin Univ., Madison, WI, USA
Volume :
19
Issue :
6
fYear :
1993
fDate :
6/1/1993
Firstpage :
571
Lastpage :
583
Abstract :
An approach to checkpointing and rollback recovery in a distributed computing system using a common time base is proposed. A common time base is established in the system using a hardware clock synchronization algorithm. This common time base is coupled with the idea of pseudo-recovery points to develop a checkpointing algorithm that has the following advantages: a shorter wait for commitment when establishing recovery lines, fewer messages to be exchanged, and lower memory requirements. These advantages are assessed quantitatively by developing a probabilistic model.
Keywords :
distributed processing; fault tolerant computing; system recovery; checkpointing; common time base; distributed system; hardware clock synchronization algorithm; memory requirement; message exchange; probabilistic model; pseudo-recovery points; recovery lines; rollback recovery; Checkpointing; Clocks; Distributed computing; Fault tolerant systems; Hardware; NASA; Real time systems; Resumes; Synchronization; Testing;
fLanguage :
English
Journal_Title :
Software Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
0098-5589
Type :
jour
DOI :
10.1109/32.232022
Filename :
232022