Title :
A parallel computing model for large-graph mining with MapReduce
Author :
Bin Wu ; Yuxiao Dong ; Qing Ke ; Yanan Cai
Author_Institution :
School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, China
Abstract :
How can we quickly find the structure and characteristics of a large-scale graph? Large-scale graphs are everywhere: CALL graphs, the World Wide Web, Facebook networks, and many more. The continued exponential growth in both the size and complexity of these graphs poses a new challenge to analysts and researchers, and a new class of algorithms and computing models for large-scale graphs is urgently needed. A promising approach to handling very large graphs is the emerging MapReduce framework and its open-source implementation, Hadoop. Enumerating the 3-cliques of a graph is an important operation that supports structure mining, but it is difficult to carry out for very large graphs on a single computer. In this paper, we propose a parallel computing model for 3-clique enumeration in large-scale graphs, based on a cluster system and MapReduce. The enumeration first extracts the one-leap information of the graph, then the two-leap information, and finally performs key-based 3-clique enumeration. We also apply the computing model to the computation of the clustering coefficient. The model is applied to three large real-world CALL graphs, and the experimental results demonstrate its good scalability and efficiency.
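The three-stage process named in the abstract can be illustrated with a minimal single-machine sketch in Python. This is not the authors' implementation and it does not use Hadoop. The abstract does not define "one-leap" or "two-leap" information, so the sketch assumes that one-leap information is each vertex's list of higher-numbered neighbors, that two-leap information is the set of two-step paths keyed by their end vertices, and that the key-based step checks whether an edge exists for each key. All function names are hypothetical. The local clustering coefficient uses the standard definition 2T(v) / (d(v)(d(v) - 1)), where T(v) is the number of 3-cliques containing v and d(v) is its degree.

```python
from collections import defaultdict
from itertools import combinations


def one_leap(edges):
    """Stage 1 (assumed meaning): map each vertex to its higher-numbered neighbors."""
    adj = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        lo, hi = (u, v) if u < v else (v, u)
        adj[lo].add(hi)
    return adj


def two_leap(adj):
    """Stage 2 (assumed meaning): emit each two-step path v - u - w under the key (v, w)."""
    paths = defaultdict(list)
    for u, nbrs in adj.items():
        for v, w in combinations(sorted(nbrs), 2):
            paths[(v, w)].append(u)
    return paths


def three_cliques(edges):
    """Stage 3: a path keyed by (v, w) closes into a 3-clique when edge (v, w) exists."""
    adj = one_leap(edges)
    paths = two_leap(adj)
    return sorted(
        (u, v, w)
        for (v, w), centers in paths.items()
        if w in adj.get(v, ())
        for u in centers
    )


def clustering_coefficients(edges):
    """Local clustering coefficient of every vertex, derived from the 3-clique list."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)
    triangles = defaultdict(int)
    for clique in three_cliques(edges):
        for x in clique:
            triangles[x] += 1
    result = {}
    for x, nbrs in neighbors.items():
        d = len(nbrs)
        result[x] = 2.0 * triangles[x] / (d * (d - 1)) if d > 1 else 0.0
    return result
```

In a MapReduce setting, each stage would correspond roughly to one job whose reducer groups records by the key used in the dictionaries above. Ordering vertices so that each 3-clique is generated only from its smallest vertex is a design choice of this sketch that avoids duplicate output.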
Keywords :
data mining; graph theory; parallel processing; CALL graph; Facebook networks; Hadoop; MapReduce framework; World Wide Web; cluster system; clustering coefficient; key-based 3-clique enumeration; large-graph mining; large-scale graph; one-leap information; parallel computing model; two-leap information; Clustering algorithms; Computational modeling; Data mining; Distributed databases; Parallel processing; Scalability; Social network services; 3-clique; MapReduce; clustering coefficient; graph mining; social network analysis;
Conference_Title :
Natural Computation (ICNC), 2011 Seventh International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-9950-2
DOI :
10.1109/ICNC.2011.6022061