DocumentCode :
3186660
Title :
GCA: An algorithm based on the Gower similarity for clustering of categorical variables
Author :
dos Santos, T.R.L. ; Zarate, Luis E.
fYear :
2012
fDate :
1-5 Oct. 2012
Firstpage :
1
Lastpage :
6
Abstract :
Data clustering is a technique for grouping objects from a database that present similar characteristics. These databases may contain different variable types (numeric, categorical, scalar, binary, etc.), but categorical variables are a challenge for clustering because they lack a natural ordering. As a result, there is a shortage of tools and algorithms for clustering databases with categorical variables. The present work proposes a new clustering algorithm for categorical data called GCA (Gower Clustering Algorithm), based on a combination of the TaxMap algorithm and the Gower similarity coefficient. GCA was compared with two other algorithms (CLOPE and FarthestFirst), and a brief statistical analysis indicated that GCA performed well enough to help address this shortage.
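As a minimal illustration of the similarity measure named in the abstract (not the GCA algorithm itself, and not the TaxMap step, neither of which is detailed in this record), the sketch below computes Gower's coefficient for purely categorical records. For nominal attributes the coefficient reduces to the fraction of comparable attributes on which two records agree. The function name `gower_similarity_categorical` and the use of `None` for a missing value are assumptions made for this example.

```python
from typing import Hashable, Optional, Sequence


def gower_similarity_categorical(
    a: Sequence[Optional[Hashable]],
    b: Sequence[Optional[Hashable]],
) -> float:
    """Gower similarity for two records with categorical attributes only.

    Each attribute scores 1 if the values are equal and 0 otherwise.
    Attributes where either value is missing (None, by assumption here)
    are excluded from both the numerator and the denominator.
    """
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")

    compared = 0  # attributes where both values are present
    matches = 0   # attributes where the present values agree
    for x, y in zip(a, b):
        if x is None or y is None:
            continue
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable attributes between the two records")
    return matches / compared


if __name__ == "__main__":
    r1 = ["red", "small", "round"]
    r2 = ["red", "large", "round"]
    print(gower_similarity_categorical(r1, r2))
```

In this example the two records agree on two of three attributes, so the similarity is 2/3. Because the score depends only on equality, it does not require the ordering that categorical variables lack.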
Keywords :
pattern clustering; statistical analysis; Gower clustering algorithm; Gower similarity coefficient measurement; TaxMap algorithm; categorical variable data clustering technique; clustering databases; Algorithm design and analysis; Clustering algorithms; Data mining; Databases; Electronic mail; Machine learning; Algorithm; categorical; clustering; similarity
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Informatica (CLEI), 2012 XXXVIII Conferencia Latinoamericana En
Conference_Location :
Medellin
Print_ISBN :
978-1-4673-0794-9
Type :
conf
DOI :
10.1109/CLEI.2012.6427180
Filename :
6427180