DocumentCode :
2396329
Title :
A hybrid distance-based outlier detection approach
Author :
Huang, Yanyan ; Zhang, Zhongnan ; Liao, Minghong ; Tan, Yize ; Zhou, Shaobin
Author_Institution :
Software Sch., Xiamen Univ., Xiamen, China
fYear :
2012
fDate :
19-20 May 2012
Firstpage :
2212
Lastpage :
2216
Abstract :
Most real-world datasets contain outliers. Outliers can signal abnormal states that often indicate significant performance degradation or danger in certain circumstances. Outlier detection therefore plays an important role in data mining. This paper proposes a hybrid distance-based outlier detection approach. It uses the average distance as the neighborhood distance and records the number of data objects within each object's neighborhood, from which the average number of neighbors is calculated. Using this average as a threshold, the data set is divided into two parts: a non-outlier data set and a candidate data set. Outliers are then filtered out of the candidate data set by calculating the distances between each candidate object and its k-nearest neighbors. Experimental results show that the approach can effectively detect outliers.
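The two-stage procedure summarized in the abstract can be sketched roughly as follows. This is an illustrative reading only, not the authors' implementation: the abstract does not say how the "average distance" is computed or how candidates are finally judged, so the sketch assumes the mean of all pairwise Euclidean distances as the neighborhood radius and ranks candidates by their mean distance to the k nearest neighbors. The function name `hybrid_outliers` and the parameters `k` and `n_outliers` are hypothetical.

```python
import math
from itertools import combinations


def hybrid_outliers(points, k=3, n_outliers=1):
    """Return indices of suspected outliers (needs at least two points)."""
    n = len(points)
    # Full pairwise Euclidean distance matrix.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Stage 1a: neighborhood radius = mean pairwise distance (assumption).
    pair_total = sum(dist[i][j] for i, j in combinations(range(n), 2))
    radius = pair_total / (n * (n - 1) / 2)

    # Stage 1b: count the neighbors of each object inside that radius.
    counts = [
        sum(1 for j in range(n) if j != i and dist[i][j] <= radius)
        for i in range(n)
    ]

    # Stage 1c: the average neighbor count is the threshold; objects
    # below it become candidates, the rest are treated as non-outliers.
    threshold = sum(counts) / n
    candidates = [i for i in range(n) if counts[i] < threshold]

    # Stage 2: score candidates by mean distance to their k nearest
    # neighbors and keep the highest-scoring ones (assumed decision rule).
    def knn_score(i):
        nearest = sorted(dist[i][j] for j in range(n) if j != i)[:k]
        return sum(nearest) / len(nearest)

    ranked = sorted(candidates, key=knn_score, reverse=True)
    return ranked[:n_outliers]


# Five clustered points plus one distant point at index 5.
sample = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
print(hybrid_outliers(sample, k=3, n_outliers=1))  # [5]
```

In this sketch the first stage is cheap pruning: only objects with fewer neighbors than average reach the k-nearest-neighbor scoring step.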
Keywords :
data mining; pattern clustering; average distance; candidate data set; candidate object; data objects; hybrid distance-based outlier detection approach; k-nearest neighbors; neighborhood distance; nonoutlier data set; performance degradation; Algorithm design and analysis; Data models; Data preprocessing; Educational institutions; Euclidean distance; Partitioning algorithms; average number of neighbors; outlier detection
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Systems and Informatics (ICSAI), 2012 International Conference on
Conference_Location :
Yantai
Print_ISBN :
978-1-4673-0198-5
Type :
conf
DOI :
10.1109/ICSAI.2012.6223490
Filename :
6223490