DocumentCode
402859
Title
TCBLSA: a new method of text clustering
Author
Xu, Jian-Suo ; Wang, Zheng-Ou
Author_Institution
Inst. of Syst. Eng., Tianjin Univ., China
Volume
1
fYear
2003
fDate
2-5 Nov. 2003
Firstpage
63
Abstract
This paper presents TCBLSA, a new text clustering method based on latent semantic analysis (LSA). The term-weight vector space model (VSM) is constructed using LSA together with the TF-IDF method. The method reduces the dimensionality of the document vectors and eliminates disadvantageous factors in the VSM. It also improves the speed and precision of text clustering. Analysis of experimental data demonstrates that TCBLSA is effective and feasible for text clustering.
Keywords
singular value decomposition; text analysis; latent semantic analysis theory; term weight; text clustering; vector space model; Clustering methods; Data analysis; Frequency; Functional analysis; Machine learning; Matrix decomposition; Statistical analysis; Systems engineering and theory; Text mining
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics, 2003 International Conference on
Print_ISBN
0-7803-8131-9
Type
conf
DOI
10.1109/ICMLC.2003.1264443
Filename
1264443