• DocumentCode
    243608
  • Title

    EgoLP: Fast and Distributed Community Detection in Billion-Node Social Networks

  • Author

    Buzun, Nazar ; Korshunov, Anton ; Avanesov, Valeriy ; Filonenko, Ilya ; Kozlov, Ilya ; Turdakov, Denis ; Kim, Hangkyu

  • Author_Institution
    Inst. for Syst. Program., Fed. Agency of Sci. Organizations, Moscow, Russia
  • fYear
    2014
  • fDate
    14-14 Dec. 2014
  • Firstpage
    533
  • Lastpage
    540
  • Abstract
    Community structure is one of the most important and characteristic features of social networks. Numerous methods for discovering implicit user communities from a social graph of users have been proposed in recent years. However, most of them have performance and scalability issues which make them hardly applicable to population-wide analysis of modern social networks (billions of users and growing). In this paper we present EgoLP - an efficient and fully distributed method for social community detection. The method is based on propagating community labels through the network with the help of friendship groups of individual users. Experimental evaluation of Apache Spark implementation of the method showed that it outperforms some state-of-the-art methods in terms of a) similarity of extracted communities to the reference ones from synthetic networks, b) precision of user attributes prediction in Facebook based solely on community memberships, c) likelihood of the discovered community structure according to the proposed generative model. At the same time, the method retains near-linear complexity in the number of edges and is thus applicable to social graphs of up to 10^9 users.
  • Keywords
    computational complexity; graph theory; social networking (online); Apache Spark implementation; EgoLP; Facebook; billion-node social networks; community label propagation; community memberships; community structure; distributed community detection; near-linear complexity; social graphs; synthetic networks; user attribute prediction; Accuracy; Communities; Image edge detection; Receivers; Scalability; Social network services; Sparks; Community detection; distributed algorithms; graph clustering; social networks
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshop (ICDMW), 2014 IEEE International Conference on
  • Conference_Location
    Shenzhen
  • Print_ISBN
    978-1-4799-4275-6
  • Type

    conf

  • DOI
    10.1109/ICDMW.2014.158
  • Filename
    7022642