DocumentCode :
2875624
Title :
GAUL: Gestalt Analysis of Unstructured Logs for Diagnosing Recurring Problems in Large Enterprise Storage Systems
Author :
Zhou, Pin ; Gill, Binny ; Belluomini, Wendy ; Wildani, Avani
Author_Institution :
IBM Almaden Res., San Jose, CA, USA
fYear :
2010
fDate :
Oct. 31 2010-Nov. 3 2010
Firstpage :
148
Lastpage :
159
Abstract :
We present GAUL, a system that automates whole log comparison between a new problem and previously diagnosed ones in order to identify recurring problems. GAUL uses a fuzzy match algorithm based on the contextual overlap between log lines and implements it efficiently using scalable index/search. The accuracy and efficiency of the comparison are further improved by leveraging problem set information and noise tolerance techniques. We evaluate GAUL using 4339 customer problems that occurred in all field deployments of an enterprise storage system over the course of a year. Our results show that with human-filtered logs, GAUL can identify the correct problem set 66% of the time among the top 10 matches, which is 15% more accurate than the VSM system that uses cosine similarity and 19% more accurate than the ERRCMP system that uses error codes for log comparison. With unfiltered logs, the top 10 match accuracy of GAUL is 40%, which is 22% more accurate than VSM and 26% more accurate than ERRCMP.
Keywords :
business data processing; fuzzy set theory; pattern matching; program diagnostics; ERRCMP system; GAUL system; Gestalt analysis; contextual overlap; cosine similarity; error codes; fuzzy match algorithm; human-filtered log; large enterprise storage system; leveraging problem set information; log comparison; log lines; noise tolerance technique; recurring problem diagnosis; recurring problem identification; scalable index; scalable search; unstructured logs; Accuracy; Hardware; Humans; Indexes; Microprogramming; Noise; Search problems; Problem diagnosis; fuzzy match; index; search; whole log comparison;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Reliable Distributed Systems, 2010 29th IEEE Symposium on
Conference_Location :
New Delhi
ISSN :
1060-9857
Print_ISBN :
978-0-7695-4250-8
Type :
conf
DOI :
10.1109/SRDS.2010.25
Filename :
5623389