DocumentCode
2840436
Title
Clustering Streaming Graphs
Author
Eldawy, Ahmed ; Khandekar, Rohit ; Wu, Kun-Lung
Author_Institution
Univ. of Minnesota, Minneapolis, MN, USA
fYear
2012
fDate
18-21 June 2012
Firstpage
466
Lastpage
475
Abstract
In this paper, we propose techniques for clustering large-scale "streaming" graphs where the updates to a graph are given in form of a stream of vertex or edge additions and deletions. Our algorithm handles such updates in an online and incremental manner and it can be easily parallel zed. Several previous graph clustering algorithms fall short of handling massive and streaming graphs because they are centralized, they need to know the entire graph beforehand and are not incremental, or they incur an excessive computational overhead. Our algorithm\´s fundamental building block is called graph reservoir sampling. We maintain a reservoir sample of the edges as the graph changes while satisfying certain desired properties like bounding number of clusters or cluster-sizes. We then declare connected components in the sampled sub graph as clusters of the original graph. Our experiments on real graphs show that our approach not only yields clusterings with very good quality, but also obtains orders of magnitude higher throughput, when compared to offline algorithms.
Keywords
graph theory; pattern clustering; computational overhead; fundamental building block; graph clustering algorithm; graph reservoir sampling; large-scale streaming graphs clustering; Clustering algorithms; Complexity theory; Heuristic algorithms; Image edge detection; Partitioning algorithms; Reservoirs; Throughput; clustering; graph; online; streaming;
fLanguage
English
Publisher
ieee
Conference_Titel
Distributed Computing Systems (ICDCS), 2012 IEEE 32nd International Conference on
Conference_Location
Macau
ISSN
1063-6927
Print_ISBN
978-1-4577-0295-2
Type
conf
DOI
10.1109/ICDCS.2012.20
Filename
6258019
Link To Document