Title :
Outlier detection on uncertain data: Objects, instances, and inferences
Author :
Jiang, Bin ; Pei, Jian
Author_Institution :
Sch. of Comput. Sci., Simon Fraser Univ., Burnaby, BC, Canada
Abstract :
This paper studies the problem of outlier detection on uncertain data. We start with a comprehensive model considering both uncertain objects and their instances. An uncertain object has some inherent attributes and consists of a set of instances which are modeled by a probability density distribution. We detect outliers at both the instance level and the object level. To detect outlier instances, it is a prerequisite to know normal instances. By assuming that uncertain objects with similar properties tend to have similar instances, we learn the normal instances for each uncertain object using the instances of objects with similar properties. Consequently, outlier instances can be detected by comparing against normal ones. Furthermore, we can detect outlier objects most of whose instances are outliers. Technically, we use a Bayesian inference algorithm to solve the problem, and develop an approximation algorithm and a filtering algorithm to speed up the computation. An extensive empirical study on both real data and synthetic data verifies the effectiveness and efficiency of our algorithms.
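The two detection levels described in the abstract can be illustrated with a minimal sketch. It is not the paper's Bayesian inference algorithm: a similarity-weighted Gaussian kernel score is swapped in as a simple stand-in, and every name, bandwidth, and threshold below (`detect_outliers`, `attr_bw`, `inst_bw`, `density_threshold`, `object_fraction`, the sensor data) is hypothetical. Each object is assumed to have one numeric attribute and a list of numeric instances.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    """Unnormalized Gaussian kernel between two scalar values."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def detect_outliers(objects, attr_bw=1.0, inst_bw=1.0,
                    density_threshold=0.1, object_fraction=0.5):
    """objects maps a name to (attribute, [instances]).

    Returns (outlier_instances, outlier_objects). All parameters are
    illustrative assumptions, not values taken from the paper.
    """
    outlier_instances = {}
    outlier_objects = []
    for name, (attr, instances) in objects.items():
        # Weight every other object by how similar its attribute is.
        peers = [(gaussian_kernel(attr, a, attr_bw), insts)
                 for other, (a, insts) in objects.items() if other != name]
        total = sum(w * len(insts) for w, insts in peers)
        flagged = []
        for x in instances:
            # Score x against the instances of similar objects.
            score = sum(w * gaussian_kernel(x, y, inst_bw)
                        for w, insts in peers for y in insts)
            score = score / total if total > 0 else 0.0
            if score < density_threshold:
                flagged.append(x)  # instance-level outlier
        outlier_instances[name] = flagged
        # Object-level outlier: more than the given fraction of instances flagged.
        if len(flagged) > object_fraction * len(instances):
            outlier_objects.append(name)
    return outlier_instances, outlier_objects


# Hypothetical temperature sensors: (location attribute, readings).
sensors = {
    "s1": (0.0, [20.0, 20.5, 21.0]),
    "s2": (0.1, [20.2, 20.8, 35.0]),
    "s3": (0.2, [19.8, 20.4, 20.9]),
    "s4": (0.1, [40.0, 41.0, 20.6]),
}
instance_outliers, object_outliers = detect_outliers(sensors)
```

In this toy data, sensor `s2` has one reading far from what similar sensors report, so only that reading is flagged, while `s4` has most of its readings flagged and is therefore reported at the object level as well.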
Keywords :
Bayes methods; data handling; inference mechanisms; statistical distributions; Bayesian inference algorithm; approximation algorithm; filtering algorithm; instance level; object level; outlier detection; probability density distribution; uncertain data; uncertain objects; Approximation algorithms; Approximation methods; Equations; Inference algorithms; Kernel; Mathematical model; Temperature measurement;
Conference_Title :
Data Engineering (ICDE), 2011 IEEE 27th International Conference on
Conference_Location :
Hannover
Print_ISBN :
978-1-4244-8959-6
ISSN :
1063-6382
DOI :
10.1109/ICDE.2011.5767850