• DocumentCode
    441868
  • Title

    Application of LSA space's dimension character in document multi-hierarchy clustering

  • Author

    Liu, Yun-Feng ; Qi, Huan ; Hu, Xiang-En ; Cai, Zhi-Qiang ; Dai, Jian-Min ; Zhu, Li

  • Author_Institution
    Inst. of Syst. Eng., Huazhong Univ. of Sci. & Technol., Hubei, China
  • Volume
    4
  • fYear
    2005
  • fDate
    18-21 Aug. 2005
  • Firstpage
    2384
  • Abstract
    In LSA space, dimensions corresponding to larger singular values reflect the general concepts of language elements, while dimensions corresponding to smaller singular values reflect their particular concepts. On this basis, different dimensions of the LSA space are used to cluster documents at various concept granularities. In addition, in LSA-based document clustering, better results are obtained by clustering the row vectors of the document self-indexing matrix rather than the low-dimensional document vectors.
  • Keywords
    data mining; document handling; indexing; pattern clustering; LSA space dimension character; document multihierarchy clustering; document self-indexing matrix; document vectors; language elements; latent semantic analysis; Clustering algorithms; Computer aided instruction; Frequency; Intelligent systems; Matrix decomposition; Natural languages; Singular value decomposition; Space technology; Systems engineering and theory; Text analysis; Concept Granularity; Document Multi-hierarchy Clustering; Document Self-indexing Matrix; Latent Semantic Analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on
  • Conference_Location
    Guangzhou, China
  • Print_ISBN
    0-7803-9091-1
  • Type

    conf

  • DOI
    10.1109/ICMLC.2005.1527343
  • Filename
    1527343
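  • Illustrative sketch (not part of the original record)
    The abstract describes two ideas: selecting different LSA dimensions to cluster at different concept granularities, and clustering the rows of a document self-indexing matrix instead of the low-dimensional document vectors. The sketch below shows one possible reading of both. It is not the authors' implementation. The abstract does not define the document self-indexing matrix; here it is assumed to be the document-by-document inner-product matrix computed in the selected dimensions. The function name `lsa_document_objects`, the toy matrix `A`, and the chosen dimension ranges are all hypothetical.

```python
import numpy as np


def lsa_document_objects(term_doc, dims):
    """Return (doc_vectors, self_index) restricted to the singular dimensions in `dims`.

    term_doc : (n_terms, n_docs) array of term weights
    dims     : indices of singular dimensions to keep (0 = largest singular value)
    """
    # Thin SVD: term_doc = U @ diag(s) @ Vt, with s in descending order.
    _, s, vt = np.linalg.svd(np.asarray(term_doc, dtype=float), full_matrices=False)
    dims = np.asarray(dims)
    # One row per document, one column per kept dimension, scaled by its singular value.
    doc_vectors = vt[dims].T * s[dims]
    # ASSUMPTION: "document self-indexing matrix" = document-by-document inner products.
    self_index = doc_vectors @ doc_vectors.T
    return doc_vectors, self_index


# Toy 5-term x 4-document count matrix (made-up data, for illustration only).
A = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 2],
    [1, 1, 1, 1],
])

# Larger singular values: general concepts (coarse granularity).
coarse_vecs, coarse_rows = lsa_document_objects(A, dims=[0, 1])
# Smaller singular values: particular concepts (fine granularity).
fine_vecs, fine_rows = lsa_document_objects(A, dims=[2, 3])
```

    Under this reading, each row of `coarse_rows` or `fine_rows` describes one document by its similarity to every document in the collection, and those rows (rather than `coarse_vecs` or `fine_vecs`) would be passed to whatever clustering algorithm is used.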