DocumentCode :
1987720
Title :
KNN Based Outlier Detection Algorithm in Large Dataset
Author :
Yang, Peng ; Huang, Biao
Author_Institution :
Chongqing Univ. of Arts & Sci., Chongqing
Volume :
1
fYear :
2008
fDate :
21-22 Dec. 2008
Firstpage :
611
Lastpage :
613
Abstract :
An outlier is an object that differs markedly from the rest of the dataset on some measure. Finding such exceptions has received much attention in the data mining field. In this paper, we propose a KNN-based outlier detection algorithm that consists of two phases. First, it partitions the dataset into several clusters; then, within each cluster, it calculates the Kth nearest neighborhood of each object to find outliers. In addition, our algorithm uses a pruning scheme that effectively avoids repeated passes over the entire dataset and unnecessary computations. Experimental results on both synthetic and real-life datasets show that our algorithm is efficient for outlier detection in large datasets.
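The abstract does not specify the clustering method, the outlier score, or the pruning rule, so the following is only a minimal sketch of the two-phase idea under stated assumptions, not the authors' algorithm. It assumes a small k-means-style partitioning seeded with the first points of the input (a simplification), scores each object by the distance to its k-th nearest neighbor inside its own cluster, keeps the top `n_outliers` scores, and prunes an object as soon as its running k-th distance falls below the weakest score currently kept. The names `partition`, `kth_nn_distance`, and `find_outliers` are hypothetical.

```python
import heapq
import math


def partition(points, n_clusters, iterations=10):
    """Phase 1 (assumed): k-means-style partitioning, seeded with the first points."""
    centers = list(points[:n_clusters])
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            clusters[idx].append(p)
        # Move each center to the mean of its members; keep it if the cluster is empty.
        centers = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centers[c]
            for c, members in enumerate(clusters)
        ]
    return [members for members in clusters if members]


def kth_nn_distance(i, cluster, k, cutoff):
    """Distance from cluster[i] to its k-th nearest neighbor within the cluster.

    Returns None once the running k-th distance drops below `cutoff`: the true
    value can only be smaller, so the object cannot enter the top list (pruning).
    Returns math.inf if the cluster holds fewer than k other objects (assumption).
    """
    nearest = []  # negated distances, so nearest[0] is the largest of the k smallest
    for j, other in enumerate(cluster):
        if j == i:
            continue
        d = math.dist(cluster[i], other)
        if len(nearest) < k:
            heapq.heappush(nearest, -d)
        elif d < -nearest[0]:
            heapq.heapreplace(nearest, -d)
        if len(nearest) == k and -nearest[0] < cutoff:
            return None
    return -nearest[0] if len(nearest) == k else math.inf


def find_outliers(points, n_clusters, k, n_outliers):
    """Phase 2 (assumed): top-n objects by within-cluster k-th neighbor distance."""
    top = []      # min-heap of (score, point); top[0] is the weakest kept outlier
    cutoff = 0.0  # no pruning until n_outliers candidates have been collected
    for cluster in partition(points, n_clusters):
        for i, p in enumerate(cluster):
            score = kth_nn_distance(i, cluster, k, cutoff)
            if score is None:
                continue
            if len(top) < n_outliers:
                heapq.heappush(top, (score, p))
            elif score > top[0][0]:
                heapq.heapreplace(top, (score, p))
            if len(top) == n_outliers:
                cutoff = top[0][0]
    return sorted(top, reverse=True)


# Two tight groups of points plus one distant object.
data = [(0, 0), (10, 10), (0, 1), (1, 0), (1, 1),
        (10, 11), (11, 10), (11, 11), (30, 30)]
result = find_outliers(data, n_clusters=2, k=2, n_outliers=1)
```

In this toy input the distant object `(30, 30)` is returned as the single outlier. The cutoff lets most objects in dense regions be discarded after examining only a few neighbors, which is one plausible reading of the "unnecessary computations" the abstract says the pruning scheme avoids.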
Keywords :
data mining; K-nearest neighborhood; KNN based outlier detection; Art; Clustering algorithms; Detection algorithms; Educational technology; Geoscience and remote sensing; Kernel; Object detection; Partitioning algorithms; Principal component analysis; knn; outlier detection
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
2008 International Workshop on Education Technology and Training and 2008 International Workshop on Geoscience and Remote Sensing (ETT and GRS 2008)
Conference_Location :
Shanghai
Print_ISBN :
978-0-7695-3563-0
Type :
conf
DOI :
10.1109/ETTandGRS.2008.306
Filename :
5070231