• DocumentCode
    3104880
  • Title

    A Dynamic SOM Algorithm for Clustering Large-Scale Document Collection

  • Author

    Luo, Kegang ; Liu, Yuanchao ; Wang, Xiaolong

  • fYear
    2007
  • fDate
    22-24 Aug. 2007
  • Firstpage
    15
  • Lastpage
    20
  • Abstract
    This paper proposes a dynamic SOM algorithm based on incremental gradient descent for clustering large-scale document collections. In contrast to other SOM algorithms (e.g. GHSOM), the size of the output layer in our algorithm can be adjusted gradually and dynamically by inserting a suitable number of neurons. This greatly reduces the number of underutilized neurons, and the training results can fully represent the distribution of topics in the document collection. In addition, the computation cost of clustering large-scale document collections is reduced considerably. Overused neurons are split again to further optimize the clustering results. Experimental results demonstrate the effectiveness of the algorithm.
  • Keywords
    Clustering algorithms; Clustering methods; Computational efficiency; Computer science; Heuristic algorithms; Information technology; Large-scale systems; Navigation; Neurons; Self organizing feature maps; Text clustering; incremental gradient descent; dynamic SOM; overused neurons; underutilized neurons
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advanced Language Processing and Web Information Technology, 2007. ALPIT 2007. Sixth International Conference on
  • Conference_Location
    Luoyang, Henan, China
  • Print_ISBN
    978-0-7695-2930-1
  • Type

    conf

  • DOI
    10.1109/ALPIT.2007.55
  • Filename
    4460608