• DocumentCode
    2773178
  • Title
    Effective Criterion Functions for Efficient Agglomerative Clustering on Very Large Networks
  • Author
    Wang, Yang ; An, Mingyuan
  • Author_Institution
    Key Lab. of Comput. Syst. & Archit., Grad. Univ. of Chinese Acad. of Sci., Beijing, China
  • fYear
    2009
  • fDate
    6-9 Dec. 2009
  • Firstpage
    1040
  • Lastpage
    1045
  • Abstract
    The agglomerative clustering algorithm is widely used in data mining, image processing, bioinformatics, and pattern recognition, and it has attracted great interest from both the academic and industrial communities. However, existing studies neglect the decisive factor in the efficiency of agglomerative clustering on large complex networks and usually use criterion functions that lead to inefficiency. In this paper, we propose three effective criterion functions for improving the performance of the agglomerative clustering algorithm. We note that clustering efficiency is determined by two factors: a) the number of neighbors of the two merged clusters in each merge step; b) the number of neighbors shared by the two clusters. Based on these observations, we propose a framework for designing criterion functions that efficiently find clusters in very large networks. We devise three criterion functions that effectively control the number of neighbors of clusters and efficiently produce high-quality clusters. We have implemented our method and compared it with existing approaches on real networks; our method significantly outperforms state-of-the-art approaches on large networks.
  • Keywords
    complex networks; network theory (graphs); bioinformatics; data mining; effective criterion functions; efficient agglomerative clustering; image processing; large complex networks; pattern recognition; very large networks; Bioinformatics; Clustering algorithms; Computer architecture; Computer industry; Computer networks; Content addressable storage; Data mining; Image processing; Mining industry; Pattern recognition; agglomerative clustering; criterion function; graph;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2009. ICDM '09. Ninth IEEE International Conference on
  • Conference_Location
    Miami, FL
  • ISSN
    1550-4786
  • Print_ISBN
    978-1-4244-5242-2
  • Electronic_ISBN
    1550-4786
  • Type
    conf
  • DOI
    10.1109/ICDM.2009.91
  • Filename
    5360353