DocumentCode :
2457934
Title :
HiCS: High Contrast Subspaces for Density-Based Outlier Ranking
Author :
Keller, Fabian ; Müller, Emmanuel ; Böhm, Klemens
Author_Institution :
Inst. for Program Struct. & Data Organ., Karlsruhe Inst. of Technol. (KIT), Karlsruhe, Germany
fYear :
2012
fDate :
1-5 April 2012
Firstpage :
1037
Lastpage :
1048
Abstract :
Outlier mining is a major task in data analysis. Outliers are objects that deviate strongly from regular objects in their local neighborhood. Density-based outlier ranking methods score each object based on its degree of deviation. In many applications, these ranking methods degenerate to random listings due to low contrast between outliers and regular objects. Outliers do not show up in the scattered full space; they are hidden in multiple high contrast subspace projections of the data. Measuring the contrast of such subspaces for outlier rankings is an open research challenge. In this work, we propose a novel subspace search method that selects high contrast subspaces for density-based outlier ranking. It is designed as a pre-processing step to outlier ranking algorithms. It searches for high contrast subspaces with a significant amount of conditional dependence among the subspace dimensions. With our approach, we propose a first measure for the contrast of subspaces. Thus, we enhance the quality of traditional outlier rankings by computing outlier scores in high contrast projections only. The evaluation on real and synthetic data shows that our approach outperforms traditional dimensionality reduction techniques, naive random projections, and state-of-the-art subspace search techniques, and provides enhanced quality for outlier ranking.
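The abstract does not spell out how the contrast of a subspace is computed. The following is a minimal sketch of one way to read "conditional dependence among the subspace dimensions", not the authors' implementation: pick one attribute of the subspace, restrict the remaining attributes to random value ranges, and measure how far the attribute's conditional distribution inside that slice deviates from its marginal distribution. The function names (ks_statistic, contrast), the parameters (iterations, alpha, seed), and the choice of a Kolmogorov-Smirnov style deviation are all assumptions made for illustration.

```python
import random
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest gap between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def contrast(rows, subspace, iterations=50, alpha=0.1, seed=0):
    """Hypothetical contrast estimate for a subspace with at least two attributes.

    rows: list of equal-length lists of floats; subspace: list of column indices.
    Averages the marginal-vs-conditional deviation over random slices.
    """
    rng = random.Random(seed)
    n = len(rows)
    # Per-attribute slice width chosen so that, under independence,
    # a slice is expected to hold about alpha * n rows (assumption).
    width = max(1, int(n * alpha ** (1.0 / (len(subspace) - 1))))
    # Row indices sorted by value, one ordering per attribute.
    order = {d: sorted(range(n), key=lambda i, d=d: rows[i][d]) for d in subspace}
    total, used = 0.0, 0
    for _ in range(iterations):
        target = rng.choice(subspace)
        selected = set(range(n))
        for d in subspace:
            if d == target:
                continue
            # Condition on a random contiguous value range of attribute d.
            start = rng.randrange(n - width + 1)
            selected &= set(order[d][start:start + width])
        if not selected:
            continue  # empty slice: nothing to compare in this iteration
        marginal = [row[target] for row in rows]
        conditional = [rows[i][target] for i in selected]
        total += ks_statistic(marginal, conditional)
        used += 1
    return total / used if used else 0.0


# Toy data: columns 0 and 1 are strongly dependent, column 2 is independent noise.
data_rng = random.Random(42)
rows = []
for _ in range(500):
    x = data_rng.random()
    rows.append([x, x + data_rng.gauss(0, 0.05), data_rng.random()])

high = contrast(rows, [0, 1])  # dependent pair
low = contrast(rows, [0, 2])   # independent pair
```

On this toy data the dependent pair should score higher than the independent pair, which is the property a subspace search would exploit when choosing projections to pass to a density-based outlier ranking.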
Keywords :
data analysis; data mining; conditional dependence; density-based outlier ranking; high contrast subspace projection; outlier mining; outlier ranking algorithm; quality enhancement; subspace dimension; subspace search method; Atmospheric measurements; Correlation; Density measurement; Joints; Noise level; Probability density function
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Engineering (ICDE), 2012 IEEE 28th International Conference on
Conference_Location :
Washington, DC
ISSN :
1063-6382
Print_ISBN :
978-1-4673-0042-1
Type :
conf
DOI :
10.1109/ICDE.2012.88
Filename :
6228154