Title :
Massive Social Network Analysis: Mining Twitter for Social Good
Author :
Ediger, David ; Jiang, Kui ; Riedy, Jason ; Bader, David A. ; Corley, Courtney ; Farber, Rob ; Reynolds, W.N.
Author_Institution :
Georgia Inst. of Technol., Atlanta, GA, USA
Abstract :
Social networks produce an enormous quantity of data. Facebook consists of over 400 million active users sharing over 5 billion pieces of information each month. Analyzing this vast quantity of unstructured data presents challenges for software and hardware. We present GraphCT, a Graph Characterization Toolkit for massive graphs representing social network data. On a 128-processor Cray XMT, GraphCT estimates the betweenness centrality of an artificially generated (R-MAT) 537-million-vertex, 8.6-billion-edge graph in 55 minutes and a real-world graph (Kwak et al.) with 61.6 million vertices and 1.47 billion edges in 105 minutes. We use GraphCT to analyze public data from Twitter, a microblogging network. Twitter's message connections appear primarily tree-structured as a news dissemination system. Within the public data, however, are clusters of conversations. Using GraphCT, we can rank actors within these conversations and help analysts focus attention on a much smaller data subset.
Keywords :
data mining; information dissemination; social networking (online); tree data structures; Facebook; GraphCT; Twitter mining; dissemination system; graph characterization toolkit; massive graph; massive social network analysis; microblogging network; social good; social network; Algorithm design and analysis; Hardware; Instruction sets; Measurement; Media; Twitter;
Conference_Title :
Parallel Processing (ICPP), 2010 39th International Conference on
Conference_Location :
San Diego, CA
Print_ISBN :
978-1-4244-7913-9
ISSN :
0190-3918
DOI :
10.1109/ICPP.2010.66