DocumentCode
2396329
Title
A hybrid distance-based outlier detection approach
Author
Huang, Yanyan ; Zhang, Zhongnan ; Liao, Minghong ; Tan, Yize ; Zhou, Shaobin
Author_Institution
Software Sch., Xiamen Univ., Xiamen, China
fYear
2012
fDate
19-20 May 2012
Firstpage
2212
Lastpage
2216
Abstract
Most real-world datasets contain outliers. Outliers can signal abnormal states that often indicate significant performance degradation or danger in certain circumstances. Outlier detection therefore plays an important role in the field of data mining. This paper proposes a hybrid distance-based outlier detection approach. It uses the average distance as the neighborhood distance and records the number of data objects within each object's neighborhood, from which the average number of neighbors is calculated. Using this average as a threshold, the data set is divided into two parts: a non-outlier data set and a candidate data set. Outliers are then filtered out of the candidate data set by calculating the distances between each candidate object and its k-nearest neighbors. Experimental results show that the approach can effectively detect outliers.
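The two-stage procedure described in the abstract can be sketched roughly as follows. This is only an illustrative reading of the abstract, not the authors' implementation. The function name hybrid_outliers, the parameters k and n_outliers, the use of the mean of all pairwise Euclidean distances as the "average distance", and the final ranking by mean k-nearest-neighbor distance are all assumptions made for the example.

```python
import math


def hybrid_outliers(points, k=3, n_outliers=1):
    """Return indices of suspected outliers (hypothetical sketch).

    Assumes at least two points, each given as a coordinate tuple.
    """
    n = len(points)
    # Full pairwise Euclidean distance matrix (O(n^2), fine for a sketch).
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Stage 1a: neighborhood radius, assumed here to be the mean of all
    # pairwise distances.
    pair_sum = sum(dist[i][j] for i in range(n) for j in range(i + 1, n))
    radius = pair_sum / (n * (n - 1) / 2)

    # Stage 1b: count the other objects inside each object's neighborhood,
    # then use the average count as the split threshold.
    counts = [
        sum(1 for j in range(n) if j != i and dist[i][j] <= radius)
        for i in range(n)
    ]
    avg_count = sum(counts) / n

    # Objects with fewer neighbors than average become candidates; the rest
    # are treated as non-outliers and are not examined further.
    candidates = [i for i in range(n) if counts[i] < avg_count]

    # Stage 2: score each candidate by its mean distance to its k nearest
    # neighbors (assumed scoring rule) and keep the highest-scoring ones.
    def knn_score(i):
        nearest = sorted(dist[i][j] for j in range(n) if j != i)[:k]
        return sum(nearest) / len(nearest)

    ranked = sorted(candidates, key=knn_score, reverse=True)
    return ranked[:n_outliers]


points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
print(hybrid_outliers(points, k=3, n_outliers=1))
```

In this toy data set only the isolated point at (10, 10) has fewer neighbors than average, so it is the sole candidate passed to the k-nearest-neighbor stage. The appeal of the split is that the more expensive neighbor ranking runs only on the candidate subset.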
Keywords
data mining; pattern clustering; average distance; candidate data set; candidate object; data objects; hybrid distance-based outlier detection approach; k-nearest neighbors; neighborhood distance; nonoutlier data set; performance degradation; Algorithm design and analysis; Data models; Data preprocessing; Educational institutions; Euclidean distance; Partitioning algorithms; average number of neighbors; outlier detection
fLanguage
English
Publisher
IEEE
Conference_Titel
Systems and Informatics (ICSAI), 2012 International Conference on
Conference_Location
Yantai
Print_ISBN
978-1-4673-0198-5
Type
conf
DOI
10.1109/ICSAI.2012.6223490
Filename
6223490
Link To Document