DocumentCode
3307274
Title
Study on frequent term set-based hierarchical clustering algorithm
Author
Huiying Wang ; Xiangwei Liu
Author_Institution
Sch. of Public Adm., Univ. of Int. Bus. & Econ., Beijing, China
Volume
2
fYear
2011
fDate
26-28 July 2011
Firstpage
1182
Lastpage
1186
Abstract
In this paper, we present a text-clustering algorithm, Frequent Term Set-based Clustering (FTSC), which uses frequent term sets to cluster texts. The algorithm reduces the dimensionality of the text data efficiently, and thereby improves both the accuracy and the running speed of clustering. However, the clusters produced by FTSC cannot reflect overlap between text classes. Building on FTSC, we give an improved algorithm, Frequent Term Set-based Hierarchical Clustering (FTSHC). This algorithm determines the overlap between text classes from the overlap between frequent term sets, and uses the frequent term sets to provide an understandable description of the discovered clusters. The experimental results show that the FTSC and FTSHC algorithms are more efficient than the K-Means algorithm in clustering performance.
Keywords
pattern clustering; text analysis; FTSC algorithm; dimensionality reduction; frequent term set based hierarchical clustering algorithm; k-means algorithm; text clustering algorithm; Algorithm design and analysis; Clustering algorithms; Educational institutions; Entropy; Feature extraction; Itemsets; FTSC; Frequent Term; Text Clustering;
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
Conference_Location
Shanghai
Print_ISBN
978-1-61284-180-9
Type
conf
DOI
10.1109/FSKD.2011.6019686
Filename
6019686