Title :
Locality Sensitive Outlier Detection: A ranking driven approach
Author :
Wang, Ye ; Parthasarathy, Srinivasan ; Tatikonda, Shirish
Author_Institution :
Comput. Sci. & Eng. Dept., Ohio State Univ., Columbus, OH, USA
Abstract :
Outlier detection is fundamental to a variety of database and analytic tasks. Recently, distance-based outlier detection has emerged as a viable and scalable alternative to traditional statistical and geometric approaches. In this article, we explore the role of ranking for the efficient discovery of distance-based outliers from large, high-dimensional data sets. Specifically, we develop a light-weight ranking scheme, powered by locality sensitive hashing, that reorders the database points according to their likelihood of being an outlier. We provide theoretical arguments to justify the approach and subsequently conduct an extensive empirical study highlighting its effectiveness over extant solutions. We show that our ranking scheme improves the efficiency of the distance-based outlier discovery process by up to 5-fold. Furthermore, we find that with our approach the top outliers can often be isolated very quickly, typically by scanning less than 3% of the data set.
Keywords :
data handling; file organisation; distance-based outlier detection; distance-based outlier discovery process; light-weight ranking scheme; locality sensitive hashing; locality sensitive outlier detection; ranking driven approach; Algorithm design and analysis; Approximation algorithms; Artificial neural networks; Clustering algorithms; Databases; Nearest neighbor searches; Optimization
Conference_Title :
Data Engineering (ICDE), 2011 IEEE 27th International Conference on
Conference_Location :
Hannover
Print_ISBN :
978-1-4244-8959-6
ISSN :
1063-6382
DOI :
10.1109/ICDE.2011.5767852