• DocumentCode
    2579829
  • Title

    A Hierarchical Text Clustering Algorithm with Cognitive Situation Dimensions

  • Author

    Guo, Yi ; Shao, Zhiqing ; Hua, Nan

  • Author_Institution
    Dept. of Comput. Sci. & Eng., East China Univ. of Sci. & Technol., Shanghai
  • fYear
    2009
  • fDate
    23-25 Jan. 2009
  • Firstpage
    11
  • Lastpage
    14
  • Abstract
    Text clustering is an important task in text mining. Its purpose is to group similar text documents together efficiently to meet human interests in information searching and understanding. The clustering procedure should involve a cognitive process of text understanding or comprehension. This paper introduces CogHTC, a hierarchical text clustering algorithm inspired by cognitive situation models. CogHTC extracts representative features from four carefully selected cognitive situation dimensions, with clustering efficiency taken into account. The experimental results demonstrate good performance of CogHTC and reveal that its clustering results are class- or domain-sensitive: CogHTC performed better on cross-class clustering than on inner-class clustering.
  • Keywords
    cognitive systems; data mining; pattern clustering; query formulation; text analysis; cognitive situation dimensions; cross-class clustering; hierarchical text clustering; information searching; information understanding; inner-class clustering; text documents grouping; text mining; Clustering algorithms; Computer science; Data engineering; Data mining; Feature extraction; Frequency; Humans; Knowledge engineering; Testing; Text mining; Cognitive; Hierarchical; Situation Dimensions; Text Clustering;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Knowledge Discovery and Data Mining, 2009. WKDD 2009. Second International Workshop on
  • Conference_Location
    Moscow
  • Print_ISBN
    978-0-7695-3543-2
  • Type

    conf

  • DOI
    10.1109/WKDD.2009.17
  • Filename
    4771866