DocumentCode
2445224
Title
DB-Outlier Detection by Example in High Dimensional Datasets
Author
Li, Yuan ; Kitagawa, Hiroyuki
Author_Institution
Grad. Sch. of Syst. & Inf. Eng., Tsukuba Univ., Tsukuba
fYear
2007
fDate
15 April 2007
Firstpage
73
Lastpage
78
Abstract
Outlier detection is an important problem with applications in many fields, and such applications generally process high dimensional datasets. Among existing methods, Distance-Based outlier (DB-Outlier) detection is one of the simplest and most commonly used approaches, since it detects outliers solely by calculating distances between data points. However, in high dimensional space data is sparse, so every data point becomes a good outlier candidate. A Subspace-Based method has been proposed to deal with this curse of dimensionality; it shows that meaningful outliers are more likely to be identified by examining the behavior of data in low dimensional projections. In addition, most existing methods detect outliers using parameters that users must set in advance, and such parameters usually encode the user's implicit view of what an outlier is. Example-Based outlier detection methods have shown promise in discovering this hidden user view. In this paper, we discuss a new technique for detecting DB-Outliers in high dimensional datasets based on user examples. Our proposed method combines Subspace-Based and Example-Based methods to discover the subspace in which the user examples stand out more significantly than in any other subspace, and reports the DB-Outliers detected in that subspace.
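The idea in the abstract can be illustrated with a minimal Python sketch. It assumes the usual DB(p, D) formulation, in which a point is an outlier if at least a fraction p of the other points lie farther than distance D from it. The abstract does not say how "outstanding" is measured, so `outstanding_score` is a hypothetical stand-in (mean far-fraction of the user examples minus that of the remaining points). The exhaustive search over small subspaces, the function names, and the toy data are likewise assumptions for illustration, not the authors' algorithm.

```python
from itertools import combinations
import math


def far_fraction(points, i, dims, D):
    """Fraction of the other points lying farther than D from points[i],
    using Euclidean distance over the coordinates in dims only."""
    a = [points[i][d] for d in dims]
    far = 0
    for j, q in enumerate(points):
        if j != i and math.dist(a, [q[d] for d in dims]) > D:
            far += 1
    return far / (len(points) - 1)


def db_outliers(points, dims, p, D):
    """Indices of DB(p, D)-outliers in the subspace spanned by dims."""
    return [i for i in range(len(points))
            if far_fraction(points, i, dims, D) >= p]


def outstanding_score(points, examples, dims, D):
    """Hypothetical measure (an assumption, not taken from the paper):
    how much more isolated the user examples are than the other points."""
    ex = set(examples)
    rest = [i for i in range(len(points)) if i not in ex]
    ex_mean = sum(far_fraction(points, i, dims, D) for i in ex) / len(ex)
    rest_mean = sum(far_fraction(points, i, dims, D) for i in rest) / len(rest)
    return ex_mean - rest_mean


def best_subspace(points, examples, D, max_dim=2):
    """Exhaustively score every subspace with at most max_dim dimensions
    and return the one in which the examples stand out most."""
    n_dims = len(points[0])
    candidates = [c for k in range(1, max_dim + 1)
                  for c in combinations(range(n_dims), k)]
    return max(candidates,
               key=lambda dims: outstanding_score(points, examples, dims, D))


# Toy data (made up): points 6 and 7 are unusual only in dimension 1.
POINTS = [
    (0.0, 0.0, 3.0), (1.0, 0.1, 0.0), (2.0, 0.0, 5.0), (3.0, 0.2, 1.0),
    (4.0, 0.1, 4.0), (5.0, 0.0, 2.0), (3.0, 5.0, 2.0), (2.0, 5.2, 3.0),
]

if __name__ == "__main__":
    dims = best_subspace(POINTS, examples=[6], D=2.0)
    print(dims, db_outliers(POINTS, dims, p=0.7, D=2.0))
```

Here the user supplies only point 6 as an example. The search picks the subspace in which that example is most isolated, and running DB-Outlier detection there can also surface other points that are unusual in the same way. Exhaustive enumeration is exponential in the number of dimensions, so a real high dimensional dataset would need a smarter search strategy.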
Keywords
data mining; very large databases; data mining; distance-based outlier detection; example-based outlier detection; high dimensional dataset; subspace-based outlier detection; Clustering algorithms; Credit cards; Data engineering; Data mining; Density measurement; Extraterrestrial measurements; Object detection; Robustness; Systems engineering and theory;
fLanguage
English
Publisher
IEEE
Conference_Titel
Databases for Next Generation Researchers, 2007. SWOD 2007. IEEE International Workshop on
Conference_Location
Istanbul
Print_ISBN
1-4244-0903-9
Electronic_ISBN
1-4244-0904-7
Type
conf
DOI
10.1109/SWOD.2007.353201
Filename
4163065