Title :
An Improved Outlier Detection Method in High-dimension Based on Weighted Hypergraph
Author :
Li, Yinzhao ; Wu, Di ; Ren, Jiadong ; Hu, Changzhen
Author_Institution :
Lab. of Comput. Network Defense Technology, Beijing Inst. of Technology, Beijing, China
Abstract :
Outlier detection in high-dimensional space is an active topic in data mining; its main goal is to find the small number of data objects in a data set that exhibit abnormal behavior. In this paper, the concepts of feature vector and attribute similarity are defined, and an improved algorithm, SWHOT, based on a weighted hypergraph model is presented for outlier detection in high-dimensional space. The objects in high-dimensional space are translated into binary data; the hypergraph model of the data set is built by finding the hyperedges of the binary set, and the weight of each hyperedge is set equal to the value of the attribute similarity. The objects of the hypergraph are then clustered with the CURE algorithm, so that arbitrarily shaped clusters can be identified. Finally, outliers are found according to the point-to-window weighted support, the point-to-class belongingness, and the point-to-window weighted deviation of size; meaningful outliers in high-dimensional data can be mined by choosing appropriate user-defined thresholds. Experimental results show that SWHOT can improve scalability and precision.
Keywords :
data mining; graph theory; pattern classification; pattern clustering; set theory; support vector machines; CURE algorithm; SVDD; SWHOT algorithm; arbitrary shaped cluster; attribute similarity; binary data type translation; binary set hyperedge; data mining; data set abnormal behavior; feature vector; high-dimensional space; outlier detection method; point-to-class belongingness; point-to-window weighted deviation-of-size; point-to-window weighted support; support vector domain description; user-defined threshold; weighted hypergraph model; Clustering algorithms; Computer networks; Data engineering; Data mining; Educational institutions; Information science; Object detection; Partitioning algorithms; Space technology; Weather forecasting; clustering; hypergraph; outlier detection; similarity; weight;
Conference_Title :
Electronic Commerce and Security, 2009. ISECS '09. Second International Symposium on
Conference_Location :
Nanchang
Print_ISBN :
978-0-7695-3643-9
DOI :
10.1109/ISECS.2009.54