• DocumentCode
    2963879
  • Title

    Extending the Growing Hierarchal SOM for clustering documents in graphs domain

  • Author

    Hussin, Mahmoud F. ; Farra, Mahmoud R. ; El-Sonbaty, Yasser

  • fYear
    2008
  • fDate
    1-8 June 2008
  • Firstpage
    4028
  • Lastpage
    4035
  • Abstract
    The growing hierarchal self-organizing map (GHSOM) is the most efficient model among the variants of SOM. It has been used successfully in document clustering and in various pattern recognition applications. The main constraint that limits the implementation of this model and all the other variants of SOM is that they work only with the vector space model (VSM). In this paper, we extend the GHSOM to work in the graph domain to enhance the quality of clusters. Specifically, we represent the documents by graphs and then cluster those documents using a new algorithm, G-GHSOM (graph-based growing hierarchal SOM), obtained by modifying the GHSOM operations to work with graphs instead of the vector space. We have tested the G-GHSOM on two different document collections using three different measures for evaluating clustering quality. The experimental results of the proposed G-GHSOM show an improvement in terms of clustering quality compared to classical GHSOM.
  • Keywords
    document handling; graph theory; self-organising feature maps; VSM; clustering documents; document clustering; document collections; graph-based growing hierarchal; graphs domain; growing hierarchal SOM; growing hierarchal self-organizing map; pattern recognition; vector space model; Artificial neural networks; Clustering algorithms; Clustering methods; Neurons; Pattern recognition; Search engines; Taxonomy; Testing; Unsupervised learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence). IEEE International Joint Conference on
  • Conference_Location
    Hong Kong
  • ISSN
    1098-7576
  • Print_ISBN
    978-1-4244-1820-6
  • Electronic_ISBN
    1098-7576
  • Type

    conf

  • DOI
    10.1109/IJCNN.2008.4634377
  • Filename
    4634377