DocumentCode
1470006
Title
Optimal strategies for scheduling checkpoints and preventive maintenance
Author
Coffman, E.G., Jr. ; Gilbert, E.N.
Author_Institution
AT&T Bell Lab., Murray Hill, NJ, USA
Volume
39
Issue
1
fYear
1990
fDate
4/1/1990 12:00:00 AM
Firstpage
9
Lastpage
18
Abstract
At checkpoints during the operation of a computer, the state of the system is saved. Whenever a machine fails, it is repaired and then reset to the state saved at the latest checkpoint. In the present work, save times are known constants and repair times are random variables; failures are the epochs of a given renewal process. In scheduling the checkpoints, the cost of saves must be traded off against the cost of work lost when the computer fails. It is shown how to schedule checkpoints to minimize the mean total time to finish a given job. Similar optimization results are obtained for the tails of the distribution of the finishing time. Two variants of the basic model are considered. In one, the computer receives maintenance during each save; in the other, it does not. Applications to the M/G/1 queuing system are touched on.
Keywords
computer maintenance; queueing theory; scheduling; M/G/1 queuing system; checkpoint scheduling; computer operation; failures; optimal strategies; optimization; preventive maintenance; repair times; save times; Application software; Checkpointing; Computer applications; Costs; Finishing; Mathematical model; Preventive maintenance; Probability distribution; Processor scheduling; Random variables;
fLanguage
English
Journal_Title
Reliability, IEEE Transactions on
Publisher
IEEE
ISSN
0018-9529
Type
jour
DOI
10.1109/24.52636
Filename
52636