Title :
RS-Forest: A Rapid Density Estimator for Streaming Anomaly Detection
Author :
Ke Wu; Kun Zhang; Wei Fan; Andrea Edwards; Philip S. Yu
Abstract :
Anomaly detection in streaming data is of high interest in numerous application domains. In this paper, we propose a novel one-class semi-supervised algorithm to detect anomalies in streaming data. Underlying the algorithm is a fast and accurate density estimator, named RS-Forest, implemented by multiple fully randomized space trees (RS-Trees). The piecewise constant density estimate of each RS-Tree is defined on the tree node into which an instance falls. Each incoming instance in a data stream is scored by the density estimates averaged over all trees in the forest. Two strategies, statistical attribute range estimation with a high-probability guarantee and dual node profiles for rapid model update, are seamlessly integrated into RS-Forest to systematically address the ever-evolving nature of data streams. We derive the theoretical upper bound for the proposed algorithm and analyze its asymptotic properties via bias-variance decomposition. Empirical comparisons to state-of-the-art methods on multiple benchmark datasets demonstrate that the proposed method features a high detection rate, fast response, and insensitivity to most of the parameter settings. Algorithm implementations and datasets are available upon request.
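The scoring idea in the abstract (a piecewise constant density per tree, averaged over the forest) can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the fixed attribute ranges, the default parameters, and the formula count / (n * volume) for the per-leaf density are assumptions made for illustration, and the range estimation and dual node profiles mentioned in the abstract are omitted.

```python
import random


class Node:
    """One node of a randomized space tree over an axis-aligned box (hypothetical sketch)."""

    def __init__(self, lo, hi, depth, max_depth, rng):
        self.lo, self.hi = lo, hi
        self.count = 0            # number of training instances that fell into this leaf
        self.left = self.right = None
        if depth < max_depth:
            # Fully random split: random attribute, random cut point within its range.
            self.dim = rng.randrange(len(lo))
            self.cut = rng.uniform(lo[self.dim], hi[self.dim])
            left_hi = list(hi)
            left_hi[self.dim] = self.cut
            right_lo = list(lo)
            right_lo[self.dim] = self.cut
            self.left = Node(lo, left_hi, depth + 1, max_depth, rng)
            self.right = Node(right_lo, hi, depth + 1, max_depth, rng)

    def leaf_for(self, x):
        """Walk down to the leaf whose box contains x."""
        node = self
        while node.left is not None:
            node = node.left if x[node.dim] < node.cut else node.right
        return node

    def volume(self):
        v = 1.0
        for a, b in zip(self.lo, self.hi):
            v *= b - a
        return max(v, 1e-12)      # guard against a degenerate (zero-width) box


class RSForestSketch:
    """Averages per-tree piecewise constant density estimates (assumed form)."""

    def __init__(self, lo, hi, n_trees=25, max_depth=6, seed=0):
        rng = random.Random(seed)
        self.trees = [Node(list(lo), list(hi), 0, max_depth, rng)
                      for _ in range(n_trees)]
        self.n = 0

    def fit(self, data):
        for x in data:
            for tree in self.trees:
                tree.leaf_for(x).count += 1
            self.n += 1

    def score(self, x):
        """Average density estimate at x; a lower score suggests an anomaly."""
        if self.n == 0:
            return 0.0
        total = 0.0
        for tree in self.trees:
            leaf = tree.leaf_for(x)
            total += leaf.count / (self.n * leaf.volume())
        return total / len(self.trees)


# Usage: normal instances cluster near the centre of the unit square.
data_rng = random.Random(1)
train = [(min(max(data_rng.gauss(0.5, 0.03), 0.0), 1.0),
          min(max(data_rng.gauss(0.5, 0.03), 0.0), 1.0)) for _ in range(500)]
forest = RSForestSketch([0.0, 0.0], [1.0, 1.0])
forest.fit(train)
dense_score = forest.score((0.5, 0.5))     # inside the normal cluster
sparse_score = forest.score((0.95, 0.05))  # far from any training instance
```

In this sketch an instance far from the training cluster lands in leaves with few or no training instances, so its averaged density is lower than that of an instance inside the cluster.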
Keywords :
data mining; learning (artificial intelligence); security of data; statistical analysis; trees (mathematics); RS-forest; RS-trees; anomaly detection; asymptotic property; bias-variance decomposition; dual node profile; one-class semisupervised algorithm; piecewise constant density; randomized space trees; rapid density estimator; statistical attribute range estimation; streaming data; Benchmark testing; Data models; Detectors; Estimation; Predictive models; Upper bound; Vegetation; Anomaly detection; data streams; density estimation; ensembles; streaming data;
Conference_Title :
Data Mining (ICDM), 2014 IEEE International Conference on
Conference_Location :
Shenzhen
Print_ISBN :
978-1-4799-4303-6
DOI :
10.1109/ICDM.2014.45