Title :
Computing optimal checkpointing strategies for rollback and recovery systems
Author :
L'Ecuyer, P. ; Malenfant, Jacques
Author_Institution :
Dept of Inf., Laval Univ., Que., Canada
fDate :
4/1/1988
Abstract :
A numerical approach for computing optimal dynamic checkpointing strategies for general rollback and recovery systems is presented. The system is modeled as a Markov renewal decision process. General failure distributions, random checkpointing durations, and reprocessing-dependent recovery times are allowed. The aim is to find a dynamic decision rule that maximizes the average system availability over an infinite time horizon. A computational approach to approximate such a rule is proposed. This approach is based on value-iteration stochastic dynamic programming with spline or finite-element approximation of the value and policy functions. Numerical illustrations are provided.
Keywords :
Markov processes; decision theory; dynamic programming; performance evaluation; Markov renewal decision process; dynamic decision rule; finite-element approximation; general failure distributions; numerical approach; optimal checkpointing strategies; rollback and recovery systems; value-iteration stochastic dynamic programming; Checkpointing; Database systems; Delay; Dynamic programming; Finite element methods; Frequency; Production systems; Resumes; Spline; Stochastic processes;
Journal_Title :
Computers, IEEE Transactions on