• DocumentCode
    2118978
  • Title

    A document clustering algorithm based on improved landmark semidefinite embedding

  • Author

    Wang, Hui ; Qin, Hua ; Ding, Li-duo ; Hui, Wang

  • Author_Institution
    School of Computer and Electronic Information, Guangxi University, Nanning, China
  • fYear
    2010
  • fDate
    4-6 Dec. 2010
  • Firstpage
    4827
  • Lastpage
    4830
  • Abstract
    The document space is generally of high dimensionality, and clustering in such a high-dimensional space is often infeasible due to the curse of dimensionality. In this paper, a novel document clustering method based on improved landmark semidefinite embedding (lSDE) is proposed. Starting from the general lSDE, the point selection rule is modified using the Max-min distance algorithm, with a view to ensuring the stability of the algorithm. Using the improved lSDE, the documents are projected into a lower-dimensional kernel space in which redundant information is filtered out and documents with the same semantics lie close to each other. The documents in this low-dimensional representation are then clustered by kernel K-means. Experimental results show that the new clustering algorithm performs better than several advanced clustering methods.
  • Keywords
    Clustering algorithms; Computers; Educational institutions; Kernel; Principal component analysis; Programming; Semantics; Max-min distance algorithm; kernel K-means; nonlinear dimensionality reduction; text clustering
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Science and Engineering (ICISE), 2010 2nd International Conference on
  • Conference_Location
    Hangzhou, China
  • Print_ISBN
    978-1-4244-7616-9
  • Type

    conf

  • DOI
    10.1109/ICISE.2010.5690075
  • Filename
    5690075
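
The abstract says that the point selection rule of lSDE is modified by the Max-min distance algorithm, but it does not give the exact rule. As a rough illustration only, the following sketch shows the standard max-min (farthest-point) heuristic: starting from a fixed point, it repeatedly picks the point whose distance to its nearest already-chosen landmark is largest. The function name `max_min_landmarks`, the Euclidean distance, the fixed starting index, and the toy data are all assumptions made for this example and are not taken from the paper.

```python
import math


def max_min_landmarks(points, k, first=0):
    """Return k landmark indices chosen by max-min (farthest-point) selection.

    Assumptions (not from the paper): Euclidean distance, and a fixed
    starting index so that the selection is deterministic.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    landmarks = [first]
    # min_dist[i] is the distance from point i to its nearest chosen landmark.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(landmarks) < k:
        # The next landmark is the point farthest from all current landmarks.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        landmarks.append(nxt)
        # Refresh nearest-landmark distances with the new landmark.
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return landmarks


# Toy 2-D "documents": two tight pairs and one isolated point.
docs = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 9.0)]
selected = max_min_landmarks(docs, 3)
print(selected)
```

Because the starting index is fixed rather than random, repeated runs on the same data return the same landmarks, which is one plausible reading of the stability goal mentioned in the abstract. The semidefinite embedding and kernel K-means steps are not sketched here.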