• DocumentCode
    685481
  • Title

    Finding User Clusters in Sina Microblog

  • Author

    Kang Pei ; Kai Niu ; ZhiQiang He ; Xuan He

  • Author_Institution
    Key Lab. of Universal Wireless Commun., Beijing Univ. of Posts & Telecommun., Beijing, China
  • Volume
    1
  • fYear
    2013
  • fDate
    28-29 Oct. 2013
  • Firstpage
    406
  • Lastpage
    409
  • Abstract
    Sina microblog has been a very popular social microblog service in recent years. However, it's difficult to analyze the network structure of Sina microblog because of the huge number of users. The emergence of cloud computing gives us a new approach to analyzing large-scale social networks. Hadoop is a widely used cloud computing platform, and several clustering algorithms such as K-means and Canopy have already been implemented on it. However, the initial cluster centers of K-means are hard to select. Canopy provides a way to choose initial centers, but it is not suitable for very large data sets, and both traditional K-means and Canopy K-means converge very slowly. This paper proposes an improved method to cluster microblog users based on their relationships. We name our method "Weight Partitioned Canopy K-means" (WPCK), implement it on a Hadoop cluster, and test it along with existing methods. Experimental results show that WPCK can reduce the number of iterations by about 1/3 compared with traditional K-means and Canopy K-means, while performance is almost the same.
  • Keywords
    cloud computing; pattern clustering; public domain software; social networking (online); Hadoop cluster; Sina microblog user clustering; WPCK; canopy k-means clustering algorithm; cloud computing; cluster centers; large-scale social networks; network structure analysis; social microblog service; weight partitioned canopy k-means; Cloud computing; Clustering algorithms; Communities; Partitioning algorithms; Twitter; Vectors; Clustering; Hadoop; Microblog; Social Network;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Design (ISCID), 2013 Sixth International Symposium on
  • Conference_Location
    Hangzhou
  • Type

    conf

  • DOI
    10.1109/ISCID.2013.107
  • Filename
    6805020
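
The abstract notes that Canopy can supply initial centers for K-means. The following is a minimal single-machine sketch of that generic idea only. It is not the paper's WPCK method (the record does not describe the weight partitioning step) and it does not use Hadoop. The function names `canopy_centers` and `kmeans`, the tight threshold `t2`, and the use of Euclidean distance are all assumptions made for illustration; the loose Canopy threshold is omitted because only the centers are needed here.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def canopy_centers(points, t2):
    """Pick seed centers Canopy-style (hypothetical helper).

    Take the first remaining point as a center, then drop every remaining
    point that lies within the tight threshold t2 of it. Repeat until no
    points remain.
    """
    remaining = list(points)
    centers = []
    while remaining:
        center = remaining.pop(0)
        centers.append(center)
        remaining = [p for p in remaining if dist(p, center) > t2]
    return centers


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd-style K-means starting from the given centers.

    Returns (final_centers, iterations_used). An empty cluster keeps its
    previous center.
    """
    centers = [list(c) for c in centers]
    for iteration in range(1, max_iter + 1):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            clusters[nearest].append(p)
        new_centers = [
            [sum(col) / len(members) for col in zip(*members)] if members else old
            for members, old in zip(clusters, centers)
        ]
        if new_centers == centers:  # centers stopped moving: converged
            return new_centers, iteration
        centers = new_centers
    return centers, max_iter
```

A caller would first run `canopy_centers(points, t2)` and pass the result to `kmeans(points, seeds)`. The number of clusters then follows from `t2` rather than being chosen up front, which is the usual trade-off of Canopy seeding.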