Title :
FAVC: Clustering Categorical Data Using the Frequency of Attribute Values Combinations
Author :
Do, Hee-Jung ; Kim, Jae-Yearn
Author_Institution :
Dept. of Ind. Eng., Hanyang Univ., Seoul
Abstract :
This paper proposes a new clustering algorithm for categorical data based on the frequency of attribute value combinations (FAVC). The algorithm finds all combinations of attribute values in a record (each of which is a subset of the record's attribute values), and then groups the records using the frequency of these combinations. Because the FAVC algorithm considers all subsets of attribute values in a record, records in a cluster have not only similar attribute value sets but also strongly associated attribute values. The FAVC algorithm is evaluated on real and synthetic data sets. It is shown to produce better clustering results and shorter running times than COOLCAT.
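As an illustration of the counting step described in the abstract, the following minimal sketch enumerates every non-empty combination of attribute values in each record and tallies how often each combination occurs across the data set. It is not the authors' implementation: the function names `value_combinations` and `combination_frequencies`, the dictionary record layout, and the toy data are assumptions made for illustration, and the abstract does not specify how the frequencies are then used to group records, so that step is left out.

```python
from collections import Counter
from itertools import combinations


def value_combinations(record):
    """Yield every non-empty subset of the record's (attribute, value) pairs.

    Hypothetical helper: records are assumed to be dicts mapping an
    attribute name to a categorical value.
    """
    items = sorted(record.items())  # fixed order so equal subsets compare equal
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield combo


def combination_frequencies(records):
    """Count how many records contain each attribute value combination."""
    counts = Counter()
    for record in records:
        counts.update(value_combinations(record))
    return counts


# Toy data, assumed for illustration only.
records = [
    {"color": "red", "shape": "round"},
    {"color": "red", "shape": "square"},
    {"color": "red", "shape": "round"},
]
freq = combination_frequencies(records)
```

A record with m attributes has 2^m - 1 non-empty combinations, so an exhaustive enumeration like this one is only practical for records with a modest number of attributes.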
Keywords :
category theory; data handling; pattern clustering; COOLCAT; attribute value combination frequency; categorical data clustering; clustering algorithm; Clustering algorithms; Entropy; Euclidean distance; Frequency; Industrial engineering; Robustness;
Conference_Title :
Innovative Computing Information and Control, 2008. ICICIC '08. 3rd International Conference on
Conference_Location :
Dalian, Liaoning
Print_ISBN :
978-0-7695-3161-8
Electronic_ISBN :
978-0-7695-3161-8
DOI :
10.1109/ICICIC.2008.275