• DocumentCode
    402859
  • Title

    TCBLSA: a new method of text clustering

  • Author

    Xu, Jian-Suo ; Wang, Zheng-Ou

  • Author_Institution
    Inst. of Syst. Eng., Tianjin Univ., China
  • Volume
    1
  • fYear
    2003
  • fDate
    2-5 Nov. 2003
  • Firstpage
    63
  • Abstract
    This paper presents TCBLSA, a new text clustering method based on the theory of latent semantic analysis (LSA). A vector space model (VSM) of term weights is constructed using LSA together with the TF.IDF method. The method reduces the dimension of the document vectors and eliminates disadvantageous factors in the VSM. It also improves the speed and precision of text clustering. Analysis of experimental data demonstrates that the TCBLSA method is effective and feasible for text clustering.
  • Keywords
    singular value decomposition; text analysis; latent semantic analysis theory; term weight; text clustering; vector space model; Clustering methods; Data analysis; Frequency; Functional analysis; Machine learning; Matrix decomposition; Statistical analysis; Systems engineering and theory; Text mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2003 International Conference on
  • Print_ISBN
    0-7803-8131-9
  • Type

    conf

  • DOI
    10.1109/ICMLC.2003.1264443
  • Filename
    1264443
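
The pipeline outlined in the abstract (TF.IDF term weighting, LSA dimension reduction by singular value decomposition, then clustering in the reduced space) can be sketched roughly as follows. This is an illustrative sketch, not the authors' TCBLSA implementation: the abstract does not name the clustering algorithm, the exact weighting formula, or the number of retained dimensions, so k-means, the `count * log(N / df)` weighting, `k=2`, and all function names and sample documents below are assumptions.

```python
import math
from collections import Counter

import numpy as np


def tfidf_matrix(docs):
    """Return (terms, X), where X[i, j] is a TF.IDF weight of term i in document j."""
    tokenized = [doc.lower().split() for doc in docs]
    terms = sorted({t for toks in tokenized for t in toks})
    index = {t: i for i, t in enumerate(terms)}
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    X = np.zeros((len(terms), n_docs))
    for j, toks in enumerate(tokenized):
        for t, count in Counter(toks).items():
            X[index[t], j] = count * math.log(n_docs / df[t])
    return terms, X


def lsa_project(X, k):
    """Map each document to k latent dimensions using a truncated SVD of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Rows of the result are documents, columns are the k strongest latent dimensions.
    return (np.diag(s[:k]) @ Vt[:k]).T


def kmeans(points, n_clusters, n_iter=50):
    """Lloyd's k-means with deterministic farthest-point initialisation (assumed choice)."""
    centers = [points[0]]
    for _ in range(1, n_clusters):
        dists = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[int(np.argmax(dists))])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Hypothetical toy corpus: two documents about matrices, two about football.
docs = [
    "matrix decomposition singular value",
    "singular value matrix factorization",
    "football match goal score",
    "goal score football league",
]
terms, X = tfidf_matrix(docs)      # term-by-document weight matrix
Z = lsa_project(X, k=2)            # documents in the reduced latent space
labels = kmeans(Z, n_clusters=2)   # cluster assignment per document
print(labels)
```

Clustering runs on the rows of `Z`, which have `k` columns instead of one per vocabulary term. This is the dimension reduction the abstract credits for the gain in speed.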