DocumentCode :
560327
Title :
Text Clustering using Frequent Contextual Termset
Author :
Akhriza, Tubagus Mohammad ; Ma, Yinghua ; Li, Jianhua
Author_Institution :
Sch. of Commun. & Inf. Syst., Shanghai Jiao Tong Univ., Shanghai, China
Volume :
1
fYear :
2011
fDate :
26-27 Nov. 2011
Firstpage :
339
Lastpage :
342
Abstract :
We introduce the frequent contextual term set (FCT), an alternative approach to term set construction for text clustering that is derived from the interestingness of documents. Compared with state-of-the-art term sets, the proposed approach has several advantages: (1) it is more efficient in term set production, (2) it is more effective at storing the vocabulary shared among documents, which expresses the context among them, and (3) it is better suited to discovering the specificity of a dataset. To utilize FCT, we also introduce frequent contextual term set based hierarchical clustering (FCTHC), which adopts the concept of centroids from K-means, with some major differences. The experiments show that FCT is an appropriate pattern for text clustering and that FCTHC provides a flexible approach to cluster construction.
Keywords :
pattern clustering; text analysis; vocabulary; centroid concept; dataset specificity discovery; document interestingness; frequent contextual term set based hierarchical clustering; k-means; term set construction; term set production; text clustering; vocabulary storage; Clustering algorithms; Context; Data mining; Equations; Itemsets; Merging; Production; Frequent Contextual Termset; Frequent Itemset; Text clustering
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Management, Innovation Management and Industrial Engineering (ICIII), 2011 International Conference on
Conference_Location :
Shenzhen
Print_ISBN :
978-1-61284-450-3
Type :
conf
DOI :
10.1109/ICIII.2011.86
Filename :
6115455