• DocumentCode
    575010
  • Title

    Research on the parallel text clustering algorithm based on the semantic tree

  • Author

    Liu, Gangfeng ; Wang, Yunlan ; Zhao, Tianhai ; Li, Dongyang

  • Author_Institution
    Center for High Performance Comput., Northwestern Polytech. Univ., Xi'an, China
  • fYear
    2011
  • fDate
    Nov. 29 2011-Dec. 1 2011
  • Firstpage
    400
  • Lastpage
    403
  • Abstract
    Text clustering algorithms that rely only on word frequency neglect the semantic relationships between words, so their results are imprecise. This paper proposes a semantic tree based text clustering algorithm built on WordNet. To reduce the time complexity, we adopt a parallel algorithm in a multi-process model, in which several processes start at the same time. The master process handles data partitioning, sending information, collecting information, and clustering the result. The slave processes are mainly in charge of word frequency statistics, calculating the weights, and obtaining the hypernyms of words according to the semantic tree. Experimental results show that this algorithm achieves both higher precision and lower time complexity.
  • Keywords
    computational complexity; parallel algorithms; pattern clustering; statistics; text analysis; trees (mathematics); word processing; WordNet; data partitioning; information collection; information sending; multiprocesses model; parallel algorithm; parallel text clustering algorithm; semantic tree; time complexity reduction; word frequency statistics; word hypernyms; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Parallel algorithms; Partitioning algorithms; Semantics; Parallel Algorithm; Semantic Tree; Text Clustering; WordNet;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Sciences and Convergence Information Technology (ICCIT), 2011 6th International Conference on
  • Conference_Location
    Seogwipo
  • Print_ISBN
    978-1-4577-0472-7
  • Type

    conf

  • Filename
    6316646
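
The master/slave division of labour described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record does not give the weighting formula, the clustering method, or the message format, so the normalised term-frequency weights, the greedy cosine-threshold clustering, and every name here (`TOY_HYPERNYMS`, `worker`, `partition`, `master`, `threshold`) are hypothetical. A small dictionary stands in for WordNet hypernym lookup.

```python
from collections import Counter
from functools import partial
from math import sqrt
from multiprocessing import Pool

# Hypothetical toy hypernym map standing in for a WordNet lookup.
TOY_HYPERNYMS = {"dog": "animal", "cat": "animal",
                 "car": "vehicle", "truck": "vehicle"}


def worker(docs, hypernyms):
    """Slave-side work: count words, add hypernym counts, normalise to weights."""
    vectors = []
    for text in docs:
        counts = Counter(text.lower().split())
        # Credit each word's hypernym with that word's frequency (assumed scheme).
        for word, n in list(counts.items()):
            if word in hypernyms:
                counts[hypernyms[word]] += n
        total = sum(counts.values()) or 1  # avoid dividing by zero on empty text
        vectors.append({w: n / total for w, n in counts.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse weight vectors stored as dicts."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def partition(items, n_parts):
    """Split items into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def master(docs, hypernyms, n_workers=2, threshold=0.3, pool=None):
    """Master-side work: partition, dispatch, collect, then cluster."""
    chunks = partition(docs, n_workers)
    job = partial(worker, hypernyms=hypernyms)
    # Without a pool the chunks are processed serially, which is handy for testing.
    mapper = pool.map if pool is not None else map
    vectors = [vec for chunk in mapper(job, chunks) for vec in chunk]
    # Assumed stand-in clustering: join the first cluster whose seed is similar enough.
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


if __name__ == "__main__":
    sample = ["dog barks", "cat meows", "car drives", "truck drives"]
    with Pool(processes=2) as pool:
        print(master(sample, TOY_HYPERNYMS, n_workers=2, pool=pool))
```

In this toy run, "dog barks" and "cat meows" share no surface words and only end up in the same cluster because both gain weight on the hypernym "animal"; calling `master` with an empty hypernym map leaves them in separate clusters. That is the effect the abstract attributes to using the semantic tree instead of word frequency alone.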