• DocumentCode
    2774382
  • Title
    A Self-Organizing Map Based Approach for Document Clustering and Visualization
  • Author
    Yen, Gary G. ; Wu, Zheng
  • Author_Institution
    Oklahoma State Univ., Stillwater
  • fYear
    2006
  • fDate
    0-0 0
  • Firstpage
    3279
  • Lastpage
    3286
  • Abstract
    In this paper, the clustering and visualization capabilities of the self-organizing map (SOM), specifically tailored for the analysis of textual data, are reviewed and further developed. A novel clustering and visualization approach is proposed for the task of textual data mining. The proposed approach first transforms the document space into a multi-dimensional vector space by means of citation patterns. An intuitive and effective projection method, namely the ranked centroid projection (RCP), is then applied in conjunction with a dynamic SOM model, the growing hierarchical self-organizing map, which automatically produces document maps with various levels of detail. The RCP is used both as a data analysis tool and as a direct interface to the data. We also extend the RCP to address the problem of incremental clustering of dynamic document collections. In a set of simulations, the proposed approach is applied to a synthetic data set and two real-world scientific document collections to demonstrate its applicability.
  • Keywords
    data analysis; data mining; data visualisation; document handling; pattern clustering; self-organising feature maps; citation patterns; document clustering; document visualization; growing hierarchical self-organizing map; multi-dimensional vector space; ranked centroid projection; scientific document collections; textual data analysis; textual data mining; Clustering algorithms; Clustering methods; Data analysis; Data engineering; Data mining; Data visualization; Displays; Neurons; Pattern recognition; Prototypes;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2006. IJCNN '06. International Joint Conference on
  • Conference_Location
    Vancouver, BC
  • Print_ISBN
    0-7803-9490-9
  • Type
    conf
  • DOI
    10.1109/IJCNN.2006.247324
  • Filename
    1716546