Title :
Simulation analysis of a dynamic checkpointing strategy for real-time systems
Author :
Ranganathan, Aravindan ; Upadhyaya, Shambhu J.
Author_Institution :
Dept. of Electr. & Comput. Eng., State Univ. of New York, Buffalo, NY, USA
Abstract :
The performance of a fault-tolerant real-time system is measured by its ability to meet deadlines in the presence of errors. Checkpointing with rollback recovery is an effective technique for tolerating transient and intermittent faults in real-time systems. Several checkpointing strategies exist, which can be broadly classified as static or dynamic. In static checkpointing, the checkpointing intervals are determined before program execution and remain fixed until the program terminates. In a dynamic checkpointing strategy, by contrast, the checkpointing interval is varied during program execution based on certain criteria. This paper presents a comparative study of the performance of real-time systems adopting various rollback recovery strategies. A new dynamic checkpointing strategy is presented, which integrates the deadline constraints and a variable error rate in determining the checkpointing interval. The main idea behind this approach is to dynamically modify the inter-checkpoint interval during the execution of a task such that the online computation associated with checkpointing and recovery is minimized. A simulation study indicates that the proposed technique outperforms rollback recovery techniques based on equidistant (static) checkpointing. The generic simulation tool developed to compare various rollback recovery strategies is also described.
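The contrast between equidistant and dynamic checkpointing described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' strategy or their simulation tool, neither of which is specified in this record. Everything in it is an assumption made for illustration: the function names (simulate, static_interval, young_interval, miss_ratio), the exponential fault model, the parameter values, and the use of Young's first-order approximation sqrt(2C/lambda) as a stand-in rule for adapting the interval to the current error rate. The fault rate is sampled only at the start of each segment, which is a further simplification.

```python
import math
import random

def simulate(work, ckpt_cost, rollback_cost, rate_at, next_interval, rng):
    """Return the completion time of one task run under rollback recovery."""
    t = 0.0      # elapsed time
    done = 0.0   # work protected by the latest checkpoint
    while work - done > 1e-9:
        rate = rate_at(t)                       # assumed time-varying fault rate
        seg = min(next_interval(rate), work - done)
        span = seg + ckpt_cost                  # compute, then save a checkpoint
        # Time to the next fault (exponential model); no fault if rate is 0.
        ttf = rng.expovariate(rate) if rate > 0 else math.inf
        if ttf >= span:
            t += span
            done += seg
        else:
            t += ttf + rollback_cost            # segment lost, roll back and retry
    return t

def static_interval(length):
    """Equidistant checkpointing: the interval ignores the fault rate."""
    return lambda rate: length

def young_interval(ckpt_cost, fallback):
    """Hypothetical dynamic rule: Young's approximation sqrt(2C / rate)."""
    return lambda rate: math.sqrt(2.0 * ckpt_cost / rate) if rate > 0 else fallback

def miss_ratio(deadline, trials, seed, work, ckpt_cost, rollback_cost,
               rate_at, next_interval):
    """Fraction of simulated runs that finish after the deadline."""
    rng = random.Random(seed)
    misses = sum(
        simulate(work, ckpt_cost, rollback_cost, rate_at, next_interval, rng) > deadline
        for _ in range(trials)
    )
    return misses / trials

if __name__ == "__main__":
    # Assumed scenario: an error burst early in the run, a quieter period afterwards.
    bursty = lambda t: 0.05 if t < 50.0 else 0.005
    common = dict(deadline=160.0, trials=2000, seed=1, work=100.0,
                  ckpt_cost=1.0, rollback_cost=2.0, rate_at=bursty)
    print("static :", miss_ratio(next_interval=static_interval(10.0), **common))
    print("dynamic:", miss_ratio(next_interval=young_interval(1.0, 100.0), **common))
```

The deadline enters this sketch only as the pass/fail criterion in miss_ratio; a strategy that also uses the remaining slack when choosing the interval, as the abstract describes, would need a richer next_interval rule than the one assumed here.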
Keywords :
data integrity; fault tolerant computing; performance evaluation; real-time systems; virtual machines; checkpointing intervals; deadlines; dynamic checkpointing strategy; equidistant checkpointing; fault tolerant real-time system; generic simulation tool; intermittent faults; online computation; performance; program execution; rollback recovery strategies; simulation analysis; transient faults; variable error rate; Analytical models; Checkpointing; Computational modeling; Databases; Electric variables measurement; Error analysis; Fault tolerant systems; Job shop scheduling; Process control; Real time systems
Conference_Title :
27th Annual Simulation Symposium, 1994
Conference_Location :
La Jolla, CA
Print_ISBN :
0-8186-5620-4
DOI :
10.1109/SIMSYM.1994.283098