Title :
Detecting outliers in sliding window over categorical data streams
Author :
QunHui Wu ; Shilong Ma
Author_Institution :
State Key Lab. of Software Dev. Environ., Beijing Univ. of Aeronaut. & Astronaut., Beijing, China
Abstract :
Outlier mining is an important and active research topic in anomaly detection. It is difficult over categorical data streams, however, because data arrive at a fast rate, some data become outdated, and the outliers identified are likely to change. In this paper, we propose an efficient algorithm for mining outliers from categorical data streams. The algorithm first discovers closed frequent patterns in a sliding window. Then WCFPOF (Weighted Closed Frequent Pattern Outlier Factor) is introduced to measure the complete categorical data, and the corresponding candidate outliers are stored in QIS (Query Indexed Structure). By employing the decayed function, outdated outliers are faded out to generate the final outliers. Experimental results show that our algorithm achieves higher detection precision than FindFPOF. Moreover, it has better scalability across different data sizes.
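The pipeline outlined in the abstract (sliding window, closed frequent patterns, pattern-based outlier score) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the WCFPOF formula, the QIS layout, or the decayed function, so the sketch omits the weighting, QIS, and decay entirely. As a stand-in it uses a simple FPOF-style score, the sum of supports of the closed frequent patterns contained in a record divided by the number of closed patterns. All function names, the brute-force pattern enumeration, and the `attribute=value` item encoding are assumptions made for illustration.

```python
from collections import deque
from itertools import combinations


def closed_frequent_patterns(window, min_support):
    """Brute-force closed frequent itemsets of a small window (illustration only)."""
    n = len(window)
    counts = {}
    for record in window:
        items = sorted(record)
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    frequent = {p: c for p, c in counts.items() if c / n >= min_support}
    # A pattern is closed if no proper superset occurs in exactly the same number of records.
    return {
        p: c / n
        for p, c in frequent.items()
        if not any(p < q and c == d for q, d in frequent.items())
    }


def outlier_score(record, closed):
    """Hypothetical FPOF-style stand-in for WCFPOF: lower values suggest an outlier."""
    if not closed:
        return 0.0
    return sum(s for p, s in closed.items() if p <= record) / len(closed)


def score_stream(stream, window_size, min_support):
    """Score each arriving record against the current sliding window (no decay, no QIS)."""
    window = deque(maxlen=window_size)  # the oldest record is dropped automatically
    for record in stream:
        window.append(frozenset(record))
        closed = closed_frequent_patterns(window, min_support)
        yield record, outlier_score(window[-1], closed)


if __name__ == "__main__":
    stream = [
        {"color=red", "size=S"},
        {"color=red", "size=S"},
        {"color=red", "size=S"},
        {"color=blue", "size=XL"},
    ]
    for record, score in score_stream(stream, window_size=4, min_support=0.5):
        print(sorted(record), score)
```

Under this stand-in score, a record that contains few of the window's closed frequent patterns receives a low value and would be treated as a candidate outlier. The exhaustive enumeration is exponential in record width and is only suitable for small illustrative inputs.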
Keywords :
data mining; database indexing; fault tolerant computing; query processing; very large databases; anomaly detection; categorical data streams; decayed function; outlier detection; outlier mining; query indexed structure; sliding window; weighted closed frequent pattern outlier factor; Clustering algorithms; Computational efficiency; Data structures; Itemsets; Partitioning algorithms; Scalability; closed frequent pattern
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-61284-180-9
DOI :
10.1109/FSKD.2011.6019780