• DocumentCode
    2192507
  • Title
    P2PKMM: A Hybrid Clustering Algorithm over P2P Network
  • Author
    Deng, Zhongjun ; Song, Wei ; Zheng, Xuefeng
  • Author_Institution
    Sch. of Inf. Eng., Univ. of Sci. & Technol. Beijing, Beijing, China
  • fYear
    2010
  • fDate
    2-4 April 2010
  • Firstpage
    450
  • Lastpage
    454
  • Abstract
    Huge amounts of data are available in large-scale networks of autonomous data sources dispersed over a wide area. Data mining is an essential technology for extracting hidden, valuable knowledge from these networked data sources. In this paper, we investigate clustering, one of the most important data mining tasks, in one such networked computing environment: the Peer-to-Peer (P2P) network. The lack of central control and the sheer size of P2P systems make existing clustering techniques inapplicable. We propose a hybrid clustering algorithm called P2PKMM. Within each node, the K-medoids algorithm is used, which greatly reduces the effect of local noise. Between nodes, the K-means method is used, since it can be computed easily in a distributed environment. The proposed algorithm takes a completely decentralized approach in which peers (nodes) synchronize only with their immediate topological neighbors in the underlying communication network. Furthermore, the algorithm adapts easily to dynamic P2P networks, where existing nodes drop out, new nodes join, and the data in the network changes during execution. Experimental results show that P2PKMM not only produces highly accurate clustering results but also scales well.
  • Keywords
    data mining; pattern clustering; peer-to-peer computing; K-medoids algorithm; P2P network; P2PKMM; hybrid clustering algorithm; peer-to-peer network; Centralized control; Clustering algorithms; Communication networks; Computer networks; Control systems; Large-scale systems; Size control; Working environment noise; K-means clustering; K-medoids clustering
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Information Technology and Security Informatics (IITSI), 2010 Third International Symposium on
  • Conference_Location
    Jinggangshan
  • Print_ISBN
    978-1-4244-6730-3
  • Electronic_ISBN
    978-1-4244-6743-3
  • Type
    conf
  • DOI
    10.1109/IITSI.2010.9
  • Filename
    5453617