DocumentCode
515365
Title
A proposed outliers identification algorithm for categorical data sets
Author
Taha, Ayman ; Hegazy, Osman M.
Author_Institution
Fac. of Comput. & Inf., Cairo Univ., Cairo, Egypt
fYear
2010
fDate
28-30 March 2010
Firstpage
1
Lastpage
5
Abstract
Outliers are a minority of observations that are inconsistent with the pattern suggested by the majority of observations. Outliers identification algorithms for categorical data sets face many limitations because distance is not commonly measured in categorical data. In this paper, we propose a new unsupervised outliers identification method for categorical data sets. In contrast to other outliers identification methods, the proposed method considers the number of categories within each categorical variable. Experimental results show that the proposed method performs comparably to other outliers identification methods.
Keywords
data mining; categorical data sets; categorical variables; outliers identification algorithm; Application software; Computer errors; Detection algorithms; Frequency; Phase detection; Spatial databases; Supervised learning; Testing; Unsupervised learning; Categorical Data; Outliers Detection
fLanguage
English
Publisher
IEEE
Conference_Titel
Informatics and Systems (INFOS), 2010 The 7th International Conference on
Conference_Location
Cairo
Print_ISBN
978-1-4244-5828-8
Type
conf
Filename
5461759