DocumentCode :
3438369
Title :
Decentralized K-Means Using Randomized Gossip Protocols for Clustering Large Datasets
Author :
Fellus, Jerome ; Picard, David ; Gosselin, Philippe-Henri
Author_Institution :
ETIS, ENSEA/Univ. de Cergy-Pontoise, Cergy, France
fYear :
2013
fDate :
7-10 Dec. 2013
Firstpage :
599
Lastpage :
606
Abstract :
In this paper, we consider the clustering of very large datasets distributed over a network of computational units using a decentralized K-means algorithm. To obtain the same codebook at each node of the network, we use a randomized gossip aggregation protocol where only small messages are exchanged. We theoretically show the equivalence of the algorithm with a centralized K-means, provided a bound on the number of messages each node has to send is met. We provide experiments showing that the consensus is reached for a number of messages consistent with the bound, but also for a smaller number of messages, albeit with a less smooth evolution of the objective function.
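The scheme summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: a sum-weight ("push") style of gossip in which a randomly chosen sender keeps half of its per-cluster sums and counts and sends the other half to a random peer, a fully connected network, a shared initial codebook, and a fixed message budget per iteration. The names `local_stats`, `gossip`, `decentralized_kmeans`, `n_messages` and `seed` are hypothetical and exist only for this illustration.

```python
import random

def nearest(x, centroids):
    """Index of the centroid closest to point x (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(x, centroids[k])))

def local_stats(points, centroids):
    """Per-cluster coordinate sums and point counts for one node's local data."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0.0] * len(centroids)
    for x in points:
        k = nearest(x, centroids)
        counts[k] += 1.0
        for d in range(dim):
            sums[k][d] += x[d]
    return sums, counts

def gossip(stats, n_messages, rng):
    """Assumed gossip rule: a random sender halves its sums and counts and
    pushes the other half to a random peer. Network-wide totals are preserved,
    so each node's ratio sum/count drifts toward the global cluster mean."""
    n = len(stats)
    for _ in range(n_messages):
        i, j = rng.sample(range(n), 2)  # sender i, receiver j
        (si, ci), (sj, cj) = stats[i], stats[j]
        for k in range(len(ci)):
            ci[k] *= 0.5
            cj[k] += ci[k]
            for d in range(len(si[k])):
                si[k][d] *= 0.5
                sj[k][d] += si[k][d]

def decentralized_kmeans(node_data, init_centroids, n_iters=10,
                         n_messages=2000, seed=0):
    """Return one codebook per node after n_iters assign/gossip/update rounds."""
    rng = random.Random(seed)
    # Assumption of this sketch: every node starts from the same codebook.
    codebooks = [[list(c) for c in init_centroids] for _ in node_data]
    for _ in range(n_iters):
        stats = [local_stats(pts, cb) for pts, cb in zip(node_data, codebooks)]
        gossip(stats, n_messages, rng)
        for cb, (sums, counts) in zip(codebooks, stats):
            for k in range(len(cb)):
                if counts[k] > 0.0:  # keep the old centroid if no mass arrived
                    cb[k] = [s / counts[k] for s in sums[k]]
    return codebooks
```

In this sketch a message carries only K sums and K counts, never raw data points, which is one way to read the abstract's "small messages". Calling `decentralized_kmeans(node_data, init_centroids)` with one list of points per node returns one codebook per node; with a sufficient message budget these codebooks should agree closely with each other.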
Keywords :
distributed processing; optimisation; pattern clustering; randomised algorithms; centralized k-means algorithm; codebook; computational units; decentralized k-means algorithm; message exchange; network node; objective function; randomized gossip aggregation protocol; very-large dataset clustering; Clustering algorithms; Convergence; Data models; Optimization; Partitioning algorithms; Protocols; Vectors; Distributed clustering; randomized gossip protocols
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining Workshops (ICDMW), 2013 IEEE 13th International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4799-3143-9
Type :
conf
DOI :
10.1109/ICDMW.2013.58
Filename :
6753975