• DocumentCode
    3758986
  • Title

    A Text Clustering Algorithm Based on Find of Density Peaks

  • Author

    Peiyu Liu;Yingying Liu;Xiuyan Hou;Qingqing Li;Zhenfang Zhu

  • Author_Institution
    Shandong Yingcai Univ., Jinan, China
  • fYear
    2015
  • Firstpage
    348
  • Lastpage
    352
  • Abstract
    Text clustering is one of the core problems in text mining and information retrieval. Clustering algorithms fall into four categories: partitioning, hierarchical, density-based, and intelligent clustering algorithms. However, most clustering algorithms cannot meet the speed and self-adaptation demands of text clustering. This paper proposes a text clustering algorithm based on finding density peaks. The algorithm computes text distance and density from the similarity of text vectors. SVM is used to represent each text as a vector for the similarity calculation. Next, for each text, the algorithm computes the local density and the distance from points of higher density, removes the noise points, and selects the cluster centers. Each remaining point is assigned to the cluster represented by its nearest cluster center. Several sets of comparative experiments indicate that the density-based text clustering has advantages in reliability and robustness.
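    The steps named in the abstract (vector similarity, local density, distance to higher-density points, center selection, nearest-center assignment) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the bag-of-words term counts, the cosine distance, the cutoff `d_c`, the fixed `n_clusters`, and the `rho * delta` ranking for picking centers are all assumptions, and the noise-removal step is omitted because the abstract does not say how it is done.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two term-count Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def density_peaks(texts, d_c=0.7, n_clusters=2):
    """Cluster texts with a simplified density-peaks procedure.

    d_c and n_clusters are hypothetical parameters chosen for illustration.
    Returns (labels, centers): a cluster index per text and the center indices.
    """
    # Assumed text representation: whitespace-tokenized term counts.
    vecs = [Counter(t.lower().split()) for t in texts]
    n = len(vecs)
    dist = [[cosine_distance(vecs[i], vecs[j]) for j in range(n)] for i in range(n)]

    # Local density: number of other texts closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]

    # Distance to the nearest text of higher density. Ties in density are
    # broken by position in the density-sorted order; the densest text gets
    # its largest distance to any other text.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n
    for k, i in enumerate(order):
        if k == 0:
            delta[i] = max(dist[i])
        else:
            delta[i] = min(dist[i][j] for j in order[:k])

    # Candidate centers: texts with both high density and high delta.
    centers = sorted(range(n), key=lambda i: rho[i] * delta[i], reverse=True)
    centers = centers[:n_clusters]

    # Assign every text to the cluster of its nearest center.
    labels = [
        min(range(len(centers)), key=lambda c: dist[i][centers[c]])
        for i in range(n)
    ]
    return labels, centers
```

    Calling `density_peaks(list_of_texts, d_c=0.7, n_clusters=2)` returns one cluster label per text plus the indices of the texts chosen as centers. Suitable values for `d_c` and `n_clusters` depend on the corpus.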
  • Keywords
    "Clustering algorithms","Partitioning algorithms","Clustering methods","Robustness","Text mining","Information retrieval","Algorithm design and analysis"
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology in Medicine and Education (ITME), 2015 7th International Conference on
  • Type

    conf

  • DOI
    10.1109/ITME.2015.103
  • Filename
    7429163