DocumentCode :
515365
Title :
A proposed outliers identification algorithm for categorical data sets
Author :
Taha, Ayman ; Hegazy, Osman M.
Author_Institution :
Fac. of Comput. & Inf., Cairo Univ., Cairo, Egypt
fYear :
2010
fDate :
28-30 March 2010
Firstpage :
1
Lastpage :
5
Abstract :
Outliers are a minority of observations that are inconsistent with the pattern suggested by the majority of observations. Outliers identification algorithms for categorical data sets face many limitations because measuring distance is not common for categorical data. In this paper, we propose a new unsupervised outliers identification method for categorical data sets. In contrast to other outliers identification methods, the proposed method considers the number of categories within each categorical variable. Experimental results show that the proposed method achieves performance comparable to that of other outliers identification methods.
Keywords :
data mining; categorical data sets; categorical variables; outliers identification algorithm; Application software; Computer errors; Data mining; Detection algorithms; Frequency; Phase detection; Spatial databases; Supervised learning; Testing; Unsupervised learning; Categorical Data; Data Mining; Outliers Detection;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Informatics and Systems (INFOS), 2010 The 7th International Conference on
Conference_Location :
Cairo
Print_ISBN :
978-1-4244-5828-8
Type :
conf
Filename :
5461759