DocumentCode
2145003
Title
A Reinforcement-Learning Approach to Failure-Detection Scheduling
Author
Zeng, Fancong
Author_Institution
BEA Systems, Inc., Liberty Corner
fYear
2007
fDate
11-12 Oct. 2007
Firstpage
161
Lastpage
170
Abstract
A failure-detection scheduler for an online production system must strike a tradeoff between performance and reliability. If failure-detection processes are run too frequently, valuable system resources are spent checking and rechecking for failures. However, if failure-detection processes are run too rarely, a failure can remain undetected for a long time. In both cases, system performability suffers. We present a model-based learning approach that estimates the failure rate and then performs an optimization to find the tradeoff that maximizes system performability. We show that our approach is not only theoretically sound but practically effective, and we demonstrate its use in an implemented automated deadlock-detection system for Java.
Keywords
decision theory; learning (artificial intelligence); scheduling; software reliability; system recovery; Java; automated deadlock detection; failure detection scheduling; online production system; optimization; reinforcement learning; Convergence; Costs; Exponential distribution; Frequency; Production systems; System performance; Upper bound
fLanguage
English
Publisher
IEEE
Conference_Titel
Quality Software, 2007. QSIC '07. Seventh International Conference on
Conference_Location
Portland, OR
ISSN
1550-6002
Print_ISBN
978-0-7695-3035-2
Type
conf
DOI
10.1109/QSIC.2007.4385492
Filename
4385492