DocumentCode :
2842702
Title :
Mining unstructured log files for recurrent fault diagnosis
Author :
Reidemeister, Thomas ; Jiang, Miao ; Ward, Paul A S
Author_Institution :
E&CE Dept., Univ. of Waterloo, Waterloo, ON, Canada
fYear :
2011
fDate :
23-27 May 2011
Firstpage :
377
Lastpage :
384
Abstract :
Enterprise software systems are large and complex with limited support for automated root-cause analysis. Avoiding system downtime and loss of revenue dictates a fast and efficient root-cause analysis process. Operator practice and academic research have shown that about 80% of failures in such systems have recurrent causes; therefore, significant efficiency gains can be achieved by automating their identification. In this paper, we present a novel approach to modelling features of log files. This model offers a compact representation of log data that can be efficiently extracted from large amounts of monitoring data. We also use decision-tree classifiers to learn and classify symptoms of recurrent faults. This representation enables automated fault matching and, in addition, enables human investigators to understand manifestations of failure easily. Our model does not require any access to application source code, a specification of log messages, or deep application knowledge. We evaluate our proposal using fault-injection experiments against other proposals in the field. First, we show that the features needed for symptom definition can be extracted more efficiently than in related work. Second, we show that these features enable an accurate classification of recurrent faults using only standard machine learning techniques. This enables us to accurately identify up to 78% of the faults in our evaluation data set.
Keywords :
business data processing; data mining; data structures; decision trees; fault diagnosis; learning (artificial intelligence); pattern classification; automated fault matching; automated root-cause analysis; decision-tree classifiers; enterprise software systems; fault-injection experiments; log data representation; machine learning techniques; recurrent fault diagnosis; revenue loss; unstructured log file mining; Atmospheric measurements; Manuals; Monitoring; Particle measurements; Servers;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Integrated Network Management (IM), 2011 IFIP/IEEE International Symposium on
Conference_Location :
Dublin
Print_ISBN :
978-1-4244-9219-0
Electronic_ISBN :
978-1-4244-9220-6
Type :
conf
DOI :
10.1109/INM.2011.5990536
Filename :
5990536