• DocumentCode
    1942736
  • Title
    Building an Adaptive Hierarchy of Clusters for Text Data
  • Author
    Chen, Shan ; Alahakoon, Damminda ; Indrawan, Maria
  • Author_Institution
    Fac. of Inf. Technol., Monash Univ., Clayton, Vic.
  • Volume
    2
  • fYear
    2005
  • fDate
    28-30 Nov. 2005
  • Firstpage
    7
  • Lastpage
    12
  • Abstract
    Text clustering has been recognized as an important component in Web-based applications. Clustering data on a hierarchical structure enables exploring data at different levels of granularity, providing a more intuitive view that is close to the way humans view the world. Self-organizing map (SOM) based models have been found to have certain advantages for clustering sizeable text data. However, existing approaches do not provide an adaptive hierarchical structure within a single model. This paper proposes an unsupervised hierarchical clustering approach based on the growing self-organizing map (GSOM). By utilizing GSOM's spread factor, our approach offers an adaptive architecture capable of detecting the layers necessary to form a hierarchy, avoiding a number of issues that traditional top-down or bottom-up hierarchical clustering approaches often encounter. Experiments have shown that this approach has the potential for efficiently clustering heterogeneous text data.
  • Keywords
    Internet; pattern clustering; self-organising feature maps; text analysis; GSOM; Web-based application; adaptive hierarchical data structure; self-organizing map; text clustering; unsupervised hierarchical clustering; Data analysis; Humans; Information technology; Network topology; Neural networks; Text recognition; Visualization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence for Modelling, Control and Automation, 2005 and International Conference on Intelligent Agents, Web Technologies and Internet Commerce, International Conference on
  • Conference_Location
    Vienna
  • Print_ISBN
    0-7695-2504-0
  • Type
    conf
  • DOI
    10.1109/CIMCA.2005.1631437
  • Filename
    1631437