• DocumentCode
    13827
  • Title

    GDCluster: A General Decentralized Clustering Algorithm

  • Author

    Mashayekhi, Hoda ; Habibi, Jafar ; Khalafbeigi, Tania ; Voulgaris, Spyros ; van Steen, Maarten

  • Author_Institution
    Dept. of Comput. Eng., Sharif Univ. of Technol., Tehran, Iran
  • Volume
    27
  • Issue
    7
  • fYear
    2015
  • fDate
    July 1 2015
  • Firstpage
    1892
  • Lastpage
    1905
  • Abstract
    In many popular applications, such as peer-to-peer systems, large amounts of data are distributed among multiple sources. Analyzing this data and identifying clusters is challenging due to processing, storage, and transmission costs. In this paper, we propose GDCluster, a general, fully decentralized clustering method capable of clustering dynamic and distributed data sets. Nodes continuously cooperate through decentralized gossip-based communication to maintain summarized views of the data set. We customize GDCluster to execute partition-based and density-based clustering methods on the summarized views, and also offer enhancements to the basic algorithm. Dynamic data is handled by gradually adapting the clustering model. Our experimental evaluations show that GDCluster discovers clusters efficiently with scalable transmission cost and outperforms the popular method LSP2P.
  • Keywords
    costing; data analysis; distributed processing; pattern clustering; GDCluster; LSP2P; cluster identification; decentralized gossip-based communication; density-based clustering method; distributed data set clustering; dynamic data set clustering; general decentralized clustering algorithm; partition-based clustering method; processing cost; storage cost; transmission cost; Approximation algorithms; Clustering algorithms; Data models; Distributed databases; Partitioning algorithms; Peer-to-peer computing; Vectors; Clustering; Density-based Clustering; Distributed Systems; Dynamic System; Partition-based Clustering
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2015.2391123
  • Filename
    7006742