DocumentCode :
685481
Title :
Finding User Clusters in Sina Microblog
Author :
Kang Pei ; Kai Niu ; ZhiQiang He ; Xuan He
Author_Institution :
Key Lab. of Universal Wireless Commun., Beijing Univ. of Posts & Telecommun., Beijing, China
Volume :
1
fYear :
2013
fDate :
28-29 Oct. 2013
Firstpage :
406
Lastpage :
409
Abstract :
Sina microblog has been a very popular social microblog service in recent years. However, it is difficult to analyze the network structure of Sina microblog because of its huge number of users. The emergence of cloud computing gives us a new approach to analyzing large-scale social networks. Hadoop is a widely used cloud computing platform, and several clustering algorithms such as K-means and Canopy have already been implemented on it. However, the initial cluster centers of K-means are hard to select. Canopy provides a way to choose initial centers, but it is not suitable for very large data sets, and both traditional K-means and Canopy K-means converge very slowly. This paper proposes an improved method to cluster microblog users based on their relationships. We name our method "Weight Partitioned Canopy K-means" (WPCK), implement it on a Hadoop cluster, and test it along with existing methods. Experimental results show that WPCK can reduce the number of iterations by about 1/3 compared with traditional K-means and Canopy K-means, while the performance of the three methods is almost the same.
Keywords :
cloud computing; pattern clustering; public domain software; social networking (online); Hadoop cluster; Sina microblog user clustering; WPCK; canopy k-means clustering algorithm; cloud computing; cluster centers; large-scale social networks; network structure analysis; social microblog service; weight partitioned canopy k-means; Cloud computing; Clustering algorithms; Communities; Partitioning algorithms; Twitter; Vectors; Clustering; Hadoop; Microblog; Social Network;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Intelligence and Design (ISCID), 2013 Sixth International Symposium on
Conference_Location :
Hangzhou
Type :
conf
DOI :
10.1109/ISCID.2013.107
Filename :
6805020