Title :
Unsupervised Class Separation of Multivariate Data through Cumulative Variance-Based Ranking
Author :
Foss, Andrew ; Zaiane, Osmar R. ; Zilles, Sandra
Author_Institution :
Dept. of Comput. Sci., Univ. of Alberta, Edmonton, AB, Canada
Abstract :
This paper introduces a new extension of outlier detection approaches and a new concept: class separation through variance. We show that accumulating information about the outlierness of points in multiple subspaces leads to a ranking in which classes with differing variance naturally tend to separate. Exploiting this leads to a highly effective and efficient unsupervised class separation approach, especially useful in the difficult case of heavily overlapping distributions. Unlike typical outlier detection algorithms, this method can be applied beyond the 'rare classes' case with great success. Two novel algorithms that implement this approach are provided. Additionally, experiments show that, on high-dimensional data, the novel methods typically outperform other state-of-the-art outlier detection methods, such as Feature Bagging, SOE1, LOF, ORCA, and Robust Mahalanobis Distance, and even compete with the leading supervised classification methods.
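The abstract's central idea, accumulating outlierness over several subspaces so that classes of differing variance separate in the final ranking, can be illustrated with a minimal sketch. This is not either of the paper's two algorithms, which this record does not describe. The per-subspace score (squared distance from the median, scaled by the median absolute deviation), the rank-sum accumulation, and all function and parameter names (`subspace_outlierness`, `cumulative_ranking`, `demo`, `n_subspaces`, `subspace_size`) are assumptions chosen for illustration, as is the synthetic data.

```python
import random
import statistics


def subspace_outlierness(points, dims):
    """Assumed score for one subspace: squared distance from the
    per-dimension median, each dimension scaled by its median
    absolute deviation (MAD)."""
    scores = [0.0] * len(points)
    for d in dims:
        column = [p[d] for p in points]
        med = statistics.median(column)
        # Fall back to 1.0 if the MAD is zero (degenerate column).
        mad = statistics.median([abs(v - med) for v in column]) or 1.0
        for i, v in enumerate(column):
            scores[i] += ((v - med) / mad) ** 2
    return scores


def cumulative_ranking(points, n_subspaces=50, subspace_size=2, seed=0):
    """Accumulate each point's rank over random subspaces and return
    point indices ordered from most to least outlying overall."""
    rng = random.Random(seed)
    n_dims = len(points[0])
    totals = [0] * len(points)
    for _ in range(n_subspaces):
        dims = rng.sample(range(n_dims), subspace_size)
        scores = subspace_outlierness(points, dims)
        # Use ranks rather than raw scores so subspaces are comparable.
        order = sorted(range(len(points)), key=scores.__getitem__)
        for rank, idx in enumerate(order):
            totals[idx] += rank
    return sorted(range(len(points)), key=totals.__getitem__, reverse=True)


def demo(n_per_class=200, n_dims=10, seed=1):
    """Two fully overlapping synthetic classes (same mean, standard
    deviation 1 vs 3). Returns the fraction of the top half of the
    ranking that belongs to the high-variance class."""
    rng = random.Random(seed)
    low = [[rng.gauss(0, 1) for _ in range(n_dims)] for _ in range(n_per_class)]
    high = [[rng.gauss(0, 3) for _ in range(n_dims)] for _ in range(n_per_class)]
    ranking = cumulative_ranking(low + high)
    top = ranking[:n_per_class]
    return sum(1 for i in top if i >= n_per_class) / n_per_class


if __name__ == "__main__":
    print(f"high-variance share of top half: {demo():.2f}")
```

In this toy setting neither class is rare and both share the same center, yet the accumulated ranking places most high-variance points ahead of the low-variance ones, which is the kind of separation the abstract refers to.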
Keywords :
security of data; unsupervised learning; LOF; ORCA; SOE1; cumulative variance-based ranking; feature bagging; multivariate data; outlier detection approaches; robust Mahalanobis distance; unsupervised class separation; Bagging; Detection algorithms; Robustness; Classification; Outlier Detection; Subspaces;
Conference_Title :
Data Mining, 2009. ICDM '09. Ninth IEEE International Conference on
Conference_Location :
Miami, FL
Print_ISBN :
978-1-4244-5242-2
Electronic_ISBN :
1550-4786
DOI :
10.1109/ICDM.2009.17