DocumentCode :
580143
Title :
Real-Time Anomaly Detection in Streams of Execution Traces
Author :
Zhang, Wenke ; Bastani, Favyen ; Yen, I-Ling ; Hulin, Kevin ; Bastani, Farokh ; Khan, Latifur
fYear :
2012
fDate :
25-27 Oct. 2012
Firstpage :
32
Lastpage :
39
Abstract :
For deployed systems, software fault detection can be challenging. Faulty behaviors are generally detected from execution logs, which may contain a large volume of execution traces, making analysis extremely difficult. This paper investigates and compares the effectiveness and efficiency of various data mining techniques for software fault detection based on execution logs, including clustering-based, density-based, and probabilistic-automata-based methods. However, some existing algorithms suffer from high complexity and do not scale well to large datasets. To address this problem, we present a suite of prefix-tree-based anomaly detection techniques. The prefix tree model serves as a compact, lossless data representation of execution traces. In addition, the prefix tree distance metric provides an effective heuristic to guide the search for execution traces in close proximity to each other. In the density-based algorithm, the prefix tree distance is used to confine the K-nearest neighbor search to a small subset of the nodes, which greatly reduces computing time without sacrificing accuracy. Experimental studies show a significant speedup for our prefix-tree-based and prefix-tree-distance-guided approaches, from days to minutes in the best cases, in the automated identification of software failures.
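The abstract's idea of a prefix tree as a compact, lossless representation of execution traces can be illustrated with a minimal sketch. This is not the authors' implementation: the class names `Node` and `TracePrefixTree`, the per-node counters, and the sample events are all assumptions made for illustration, and "lossless" is taken here to mean that the multiset of traces can be recovered (arrival order is not kept). The paper's prefix tree distance metric is not defined in the abstract, so it is not sketched.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One event position in the tree (hypothetical structure)."""
    count: int = 0      # number of traces passing through this node
    terminal: int = 0   # number of traces that end exactly here
    children: dict = field(default_factory=dict)


class TracePrefixTree:
    """Stores traces so that shared prefixes are kept only once."""

    def __init__(self):
        self.root = Node()

    def insert(self, trace):
        node = self.root
        node.count += 1
        for event in trace:
            node = node.children.setdefault(event, Node())
            node.count += 1
        node.terminal += 1

    def traces(self):
        """Yield (trace, multiplicity) pairs, recovering the stored multiset."""
        stack = [(self.root, ())]
        while stack:
            node, prefix = stack.pop()
            if node.terminal:
                yield prefix, node.terminal
            for event, child in node.children.items():
                stack.append((child, prefix + (event,)))

    def size(self):
        """Number of nodes, excluding the root."""
        total, stack = 0, [self.root]
        while stack:
            node = stack.pop()
            total += len(node.children)
            stack.extend(node.children.values())
        return total


# Hypothetical sample traces: 12 events in total.
sample = [
    ("open", "read", "close"),
    ("open", "read", "close"),
    ("open", "write", "close"),
    ("open", "read", "error"),
]

tree = TracePrefixTree()
for t in sample:
    tree.insert(t)
```

In this sample the four traces hold 12 events in total, while the tree needs only 6 nodes because the `open` and `open, read` prefixes are shared. Low `count` values along a path (here, the branch ending in `error`) mark rarely seen behavior, which is the kind of signal an anomaly detector could build on.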
Keywords :
data mining; pattern classification; pattern clustering; probabilistic automata; program diagnostics; software fault tolerance; tree data structures; K-nearest neighbor search; compact lossless data representation; data mining techniques; density based methods; execution logs; execution traces; high complexity; prefix tree based anomaly detection techniques; prefix tree distance metric model; probabilistic automata based methods; real-time anomaly detection; software failures; software fault detection; Algorithm design and analysis; Automata; Clustering algorithms; Data models; Probabilistic logic; Software; Software algorithms; Anomaly detection; k-medoids clustering; local outlier factor; prefix tree; probabilistic automata
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
High-Assurance Systems Engineering (HASE), 2012 IEEE 14th International Symposium on
Conference_Location :
Omaha, NE
ISSN :
1530-2059
Print_ISBN :
978-1-4673-4742-6
Type :
conf
DOI :
10.1109/HASE.2012.13
Filename :
6375634