DocumentCode :
2261707
Title :
Optimizing system monitoring configurations for non-actionable alerts
Author :
Liang Tang ; Tao Li ; Pinel, Frederic ; Shwartz, Larisa ; Grabarnik, Genady
Author_Institution :
Sch. of Comput. Sci., Florida Int. Univ., Miami, FL, USA
fYear :
2012
fDate :
16-20 April 2012
Firstpage :
34
Lastpage :
42
Abstract :
Today's competitive business climate and the complexity of IT environments dictate efficient and cost-effective delivery and support of IT services. This is largely achieved through automation of routine maintenance procedures, including problem detection, determination and resolution. System monitoring provides an effective and reliable means of problem detection. Coupled with automated ticket creation, it ensures that a degradation of the vital signs, defined by acceptable thresholds or monitoring conditions, is flagged as a problem candidate and sent to supporting personnel as an incident ticket. This paper describes a novel methodology and a system for minimizing non-actionable tickets while preserving all tickets that require corrective action. Our proposed method defines monitoring conditions and the optimal corresponding delay times based on an off-line analysis of historical alerts and the matching incident tickets. Potential monitoring conditions are built on a set of predictive rules that are automatically generated by a rule-based learning algorithm with coverage, confidence and rule complexity criteria. These conditions and delay times are propagated as configurations into run-time monitoring systems.
Keywords :
computational complexity; knowledge based systems; learning (artificial intelligence); software maintenance; system monitoring; IT environments; IT services; automated ticket creation; automating routine maintenance procedures; confidence criteria; corrective action; cost effective service delivery; coverage criteria; delay times; historical alerts; incident ticket; incident ticket matching; monitoring conditions; nonactionable alerts; nonactionable ticket minimization; offline analysis; problem detection; problem determination; problem resolution; rule complexity criteria; rule-based learning algorithm; run-time monitoring systems; system monitoring configuration optimization; Accuracy; Delay; Monitoring; Prediction algorithms; Servers; Testing; Transient analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Network Operations and Management Symposium (NOMS), 2012 IEEE
Conference_Location :
Maui, HI
ISSN :
1542-1201
Print_ISBN :
978-1-4673-0267-8
Electronic_ISBN :
1542-1201
Type :
conf
DOI :
10.1109/NOMS.2012.6211880
Filename :
6211880