• DocumentCode
    2243333
  • Title
    Incremental clustering algorithm based on phrase-semantic similarity histogram
  • Author
    Gad, Walaa K.; Kamel, Mohamed S.
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Univ. of Waterloo, Waterloo, ON, Canada
  • Volume
    4
  • fYear
    2010
  • fDate
    11-14 July 2010
  • Firstpage
    2088
  • Lastpage
    2093
  • Abstract
    Incremental document clustering is key to organizing, searching, and browsing large datasets. Although many incremental document clustering methods have been proposed, they do not focus on the linguistic and semantic properties of the text. With the advent of online publishing on the World Wide Web, incremental clustering algorithms are preferred to traditional clustering techniques. In this paper, an incremental document clustering algorithm is introduced. The proposed algorithm integrates text semantics into the incremental clustering process. Clusters are represented using a semantic histogram, which measures the distribution of semantic similarities within each cluster. Experimental results show that the proposed algorithm has promising clustering performance compared to standard clustering methods.
  • Keywords
    data mining; document handling; natural language processing; pattern clustering; incremental document clustering; phrase-semantic similarity histogram; text semantic; Histograms; Semantics; Ontology; WordNet; semantic histogram; semantic similarity
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics (ICMLC), 2010 International Conference on
  • Conference_Location
    Qingdao
  • Print_ISBN
    978-1-4244-6526-2
  • Type
    conf
  • DOI
    10.1109/ICMLC.2010.5580499
  • Filename
    5580499