• DocumentCode
    2840436
  • Title

    Clustering Streaming Graphs

  • Author

    Eldawy, Ahmed ; Khandekar, Rohit ; Wu, Kun-Lung

  • Author_Institution
    Univ. of Minnesota, Minneapolis, MN, USA
  • fYear
    2012
  • fDate
    18-21 June 2012
  • Firstpage
    466
  • Lastpage
    475
  • Abstract
    In this paper, we propose techniques for clustering large-scale "streaming" graphs where the updates to a graph are given in form of a stream of vertex or edge additions and deletions. Our algorithm handles such updates in an online and incremental manner and it can be easily parallel zed. Several previous graph clustering algorithms fall short of handling massive and streaming graphs because they are centralized, they need to know the entire graph beforehand and are not incremental, or they incur an excessive computational overhead. Our algorithm\´s fundamental building block is called graph reservoir sampling. We maintain a reservoir sample of the edges as the graph changes while satisfying certain desired properties like bounding number of clusters or cluster-sizes. We then declare connected components in the sampled sub graph as clusters of the original graph. Our experiments on real graphs show that our approach not only yields clusterings with very good quality, but also obtains orders of magnitude higher throughput, when compared to offline algorithms.
  • Keywords
    graph theory; pattern clustering; computational overhead; fundamental building block; graph clustering algorithm; graph reservoir sampling; large-scale streaming graphs clustering; Clustering algorithms; Complexity theory; Heuristic algorithms; Image edge detection; Partitioning algorithms; Reservoirs; Throughput; clustering; graph; online; streaming;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Distributed Computing Systems (ICDCS), 2012 IEEE 32nd International Conference on
  • Conference_Location
    Macau
  • ISSN
    1063-6927
  • Print_ISBN
    978-1-4577-0295-2
  • Type

    conf

  • DOI
    10.1109/ICDCS.2012.20
  • Filename
    6258019