DocumentCode
2705881
Title
Addressing software dependability with statistical and machine learning techniques
Author
Fox, Armando
Author_Institution
Stanford Univ., CA, USA
fYear
2005
fDate
15-21 May 2005
Firstpage
8
Abstract
Summary form only given. Our ability to design and deploy large complex systems is outpacing our ability to understand their behavior. How do we detect and recover from "heisenbugs", which account for up to 40% of failures in complex Internet systems, without extensive application-specific coding? Which users were affected, and for how long? How do we diagnose and correct problems caused by configuration errors or operator errors? Although these problems are posed at a high level of abstraction, all we can usually measure directly are low-level behaviors, which is analogous to driving a car while looking through a magnifying glass. Machine learning can bridge this gap using techniques that learn "baseline" models automatically or semi-automatically, allowing the characterization and monitoring of systems whose structure is not well understood a priori. This paper discusses initial successes and future challenges in using machine learning for failure detection and diagnosis, configuration troubleshooting, attribution (which low-level properties appear to be correlated with an observed high-level effect such as decreased performance), and failure forecasting.
Keywords
learning (artificial intelligence); program diagnostics; software reliability; statistical analysis; configuration troubleshooting; machine learning; software dependability; software failure detection; software failure diagnosis; software failure forecasting; statistical techniques; system monitoring; Bridges; Computerized monitoring; Condition monitoring; Error correction; Glass; Internet; Machine learning;
fLanguage
English
Publisher
IEEE
Conference_Titel
Software Engineering, 2005. ICSE 2005. Proceedings. 27th International Conference on
Print_ISBN
1-59593-963-2
Type
conf
DOI
10.1109/ICSE.2005.1553531
Filename
1553531