DocumentCode :
3059272
Title :
Clustering Categorical Data Based on Maximal Frequent Itemsets
Author :
Dadong Yu ; Dongbo Liu
Author_Institution :
National University of Defense Technology, Changsha
fYear :
2007
fDate :
13-15 Dec. 2007
Firstpage :
93
Lastpage :
97
Abstract :
Clustering categorical data has received increasing attention in recent years, but several aspects of existing algorithms, such as the interpretability of the discovered clusters and the impact of data selection order, are not well addressed. This paper proposes a novel categorical data clustering algorithm called CLUBMIS, which can effectively find interesting clusters. In addition, the clusters can be easily interpreted through the maximal frequent itemsets used in the clustering process. Unlike most hierarchical clustering algorithms, CLUBMIS clusters datasets based on summarized information, i.e., maximal frequent itemsets, and thus eliminates the effect of data selection order.
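The abstract does not define maximal frequent itemsets or describe how CLUBMIS uses them, so the following is only a minimal sketch of the underlying notion, not the authors' algorithm. It assumes each record is a set of "attribute=value" items, that an itemset is frequent when at least `min_support` records contain it, and that it is maximal when no frequent proper superset exists. The brute-force enumeration, the function names, the toy data, and the overlap-based assignment step are all illustrative assumptions.

```python
from itertools import combinations


def frequent_itemsets(records, min_support):
    """Brute force: return every itemset contained in at least min_support records."""
    items = sorted(set().union(*records))
    frequent = []
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(1 for record in records if itemset <= record)
            if support >= min_support:
                frequent.append(itemset)
                found_any = True
        if not found_any:
            # If no itemset of this size is frequent, no larger one can be.
            break
    return frequent


def maximal_frequent_itemsets(records, min_support):
    """Keep only the frequent itemsets that have no frequent proper superset."""
    frequent = frequent_itemsets(records, min_support)
    return [s for s in frequent if not any(s < other for other in frequent)]


def assign_by_overlap(records, maximal_sets):
    """Hypothetical assignment step: label each record with the index of the
    maximal itemset it shares the most items with (NOT the CLUBMIS procedure)."""
    labels = []
    for record in records:
        overlaps = [len(record & m) for m in maximal_sets]
        labels.append(overlaps.index(max(overlaps)))
    return labels


# Toy categorical dataset (invented for illustration).
records = [
    frozenset({"color=red", "shape=round", "size=small"}),
    frozenset({"color=red", "shape=round", "size=large"}),
    frozenset({"color=red", "shape=round", "size=small"}),
    frozenset({"color=blue", "shape=square", "size=large"}),
    frozenset({"color=blue", "shape=square", "size=small"}),
]

maximal = maximal_frequent_itemsets(records, min_support=2)
labels = assign_by_overlap(records, maximal)
```

Because the maximal itemsets are computed from support counts over the whole dataset, they do not depend on the order in which records are read, which is the property the abstract attributes to clustering on summarized information. Each label can also be read off directly as the itemset that characterizes the group.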
Keywords :
data handling; pattern clustering; CLUBMIS; categorical data clustering; maximal frequent itemsets; Application software; Clustering algorithms; Computer science; Cost function; Data engineering; Educational institutions; Itemsets; Machine learning; Machine learning algorithms; Systems engineering and theory;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Machine Learning and Applications, 2007. ICMLA 2007. Sixth International Conference on
Conference_Location :
Cincinnati, OH
Print_ISBN :
978-0-7695-3069-7
Type :
conf
DOI :
10.1109/ICMLA.2007.11
Filename :
4457214