Title :
Annotated Minimum Volume Sets for Nonparametric Anomaly Discovery
Author :
Scott, Clayton D. ; Kolaczyk, Eric D.
Author_Institution :
University of Michigan, Dept. of Elec. Eng. and Comp. Sci., Ann Arbor, MI 48105
Abstract :
We consider an anomaly detection problem in which a combination of typical and anomalous data is observed and the anomalies in that particular dataset must be identified without recourse to labeled exemplars. Our goal is to produce an annotated ranking of the observations that indicates the relative priority of each for further examination as a possible anomaly, while making no assumptions about the distribution of typical data. We propose a framework in which each observation is linked to a corresponding minimum volume set and, implicitly adopting a hypothesis testing perspective, each set is associated with a test. An inherent ordering of these sets yields a natural ranking, while the association of each test with a false discovery rate yields an appropriate annotation. The combination of minimum volume set methods with false discovery rate principles, in the context of data contaminated by anomalies, is new, and estimating the key underlying quantities requires that a number of issues be addressed. We offer some solutions to the relevant estimation problems and illustrate the proposed methodology on synthetic and computer network traffic data.
Keywords :
Computer networks; IP networks; Level set; Mathematics; Pollution measurement; Statistics; Telecommunication traffic; Testing; Training data; Volume measurement; false discovery rate; minimum volume sets; monotone density estimation; multiple level set estimation; nonparametric outlier detection
Conference_Title :
Statistical Signal Processing, 2007. SSP '07. IEEE/SP 14th Workshop on
Conference_Location :
Madison, WI, USA
Print_ISBN :
978-1-4244-1198-6
Electronic_ISBN :
978-1-4244-1198-6
DOI :
10.1109/SSP.2007.4301254