DocumentCode :
3494457
Title :
A Novel Text Clustering Method Based on TGSOM and Fuzzy K-Means
Author :
Hu, Jinzhu ; Xiong, Chunxiu ; Shu, Jiangbo ; Zhou, Xing ; Zhu, Jun
Author_Institution :
Dept. of Comput. Sci., HuaZhong Normal Univ., Wuhan
Volume :
1
fYear :
2009
fDate :
7-8 March 2009
Firstpage :
26
Lastpage :
30
Abstract :
To address the high-dimensional, sparse representation of text documents and the shortcomings of existing clustering methods, an effective text clustering approach (TGSOM-FS-FKM for short) based on tree-structured growing self-organizing maps (TGSOM) and fuzzy k-means (FKM) is proposed. It first preprocesses the texts and filters out most noisy words using an unsupervised feature selection method. It then uses TGSOM to perform a first clustering, which yields a rough classification of the texts, the initial number of clusters, and each text's category. Next, LSA theory is introduced to improve clustering precision and reduce the dimension of the feature vectors. TGSOM is then used to perform a second clustering to obtain a more precise result, and a supervised feature selection method is used to select feature items. Finally, FKM is used to cluster the result set. In the experiment, the number of feature items was kept the same. Experimental results indicate that TGSOM-FS-FKM outperforms other clustering methods such as DSOM-FS-FCM, and that its precision is better than that of DSOM-FCM, DFKCN, and FDMFC clustering.
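The final stage named in the abstract, fuzzy k-means, can be sketched as follows. This is a minimal illustration using the standard fuzzy c-means update rules, which may differ from the variant the authors used; it does not cover the TGSOM, LSA, or feature selection stages. The function name `fuzzy_k_means` and the parameters `m` (fuzzifier), `iters`, `tol`, and `seed` are assumptions made for this example.

```python
import math
import random


def fuzzy_k_means(points, k, m=2.0, iters=100, tol=1e-6, seed=0):
    """Cluster `points` (equal-length lists of floats) into k fuzzy clusters.

    Returns (centers, memberships); memberships[i][j] is the degree to which
    point i belongs to cluster j, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(k)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(k):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )

        # Membership update: inverse-distance weighting with exponent 2/(m-1).
        new_u = []
        for p in points:
            dist = [math.dist(p, c) for c in centers]
            if min(dist) == 0.0:
                # Point sits exactly on one or more centers: share membership
                # among those centers only.
                hits = [1.0 if d == 0.0 else 0.0 for d in dist]
                total = sum(hits)
                new_u.append([h / total for h in hits])
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
                total = sum(inv)
                new_u.append([v / total for v in inv])

        # Stop once no membership value changes by more than tol.
        shift = max(abs(new_u[i][j] - u[i][j])
                    for i in range(n) for j in range(k))
        u = new_u
        if shift < tol:
            break

    return centers, u
```

In a text clustering setting, `points` would be the reduced document feature vectors and `k` the cluster count obtained from an earlier stage; both are supplied by the caller in this sketch.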
Keywords :
fuzzy set theory; pattern clustering; self-organising feature maps; text analysis; DFKCN; DSOM-FCM; DSOM-FS-FCM; FDMFC; TGSOM; fuzzy k-means; text clustering method; textual document; tree-structured growing self-organizing maps; unsupervised feature selection method; Clustering methods; Computer science; Computer science education; Educational technology; Filters; Functional analysis; Neurons; Self organizing feature maps; Text categorization; Text processing; tree-structured growing self-organizing maps; Fuzzy K-Means; text clustering; text clustering flow model
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Education Technology and Computer Science, 2009. ETCS '09. First International Workshop on
Conference_Location :
Wuhan, Hubei
Print_ISBN :
978-1-4244-3581-4
Type :
conf
DOI :
10.1109/ETCS.2009.14
Filename :
4958717