Title :
Performance analysis of two time-based coordinated checkpointing protocols
Author :
Kavanaugh, Gerard P. ; Sanders, William H.
Author_Institution :
Center for Reliable & High Performance Comput., Illinois Univ., Urbana, IL, USA
Abstract :
Time-based checkpointing protocols are a recently proposed way to improve a system's dependability. They claim to have the advantages of coordinated protocols without the normal costs of coordination. This paper investigates that claim by analyzing and comparing two time-based checkpointing protocols. The analysis is performed by determining the forward progress of a system using each protocol, and it is described in such a way as to be easily modifiable for other time-based protocols. By carefully analyzing the behavior of each protocol between renewal points, we are able to obtain a closed-form expression for the forward progress of the two protocols considered. We also determine the checkpoint interval value that will maximize forward progress. A validation of the analytical model is then performed via a detailed simulation. The results obtained from the model show the advantages and disadvantages of each protocol.
Keywords :
digital simulation; fault tolerant computing; performance evaluation; protocols; closed-form expression; forward progress; performance analysis; system dependability; time-based coordinated checkpointing protocols; Analytical models; Checkpointing; Closed-form solution; Costs; Performance analysis; Protocols; Resumes
Conference_Title :
Fault-Tolerant Systems, 1997. Proceedings., Pacific Rim International Symposium on
Conference_Location :
Taipei
Print_ISBN :
0-8186-8212-4
DOI :
10.1109/PRFTS.1997.640147