Title :
A MapReduce and Information Compression Based Social Community Structure Mining Method
Author :
Jin Songchang ; Li Aiping ; Yang Shuqiang ; Lin Wangqun ; Deng Bo ; Li Shudong
Author_Institution :
Sch. of Comput., Nat. Univ. of Defense Technol., Changsha, China
Abstract :
With the rapid development of social media, social community structure mining has become a popular research field in recent years. However, traditional social community mining methods cannot effectively handle data from large-scale networks. In this paper, we first introduce a community mining model based on information compression and use it to transform the community mining problem into an optimal information coding problem. We then propose a parallel computing method, CInfoMR, based on the MapReduce parallel framework, to mine social community structure. In InfoMR, map tasks split the network data into many subsets, each reduce task performs community clustering on its subset by loop iteration, and finally the results of the reduce phase are merged to produce the output. Theoretical analysis and experiments verify the validity of this work. The accuracy experiments show that the accuracy of InfoMR is much higher than that of the Fast GN and PDST algorithms. The performance experiments on two real datasets and two simulated datasets show that InfoMR can mine social communities in big-data social networks in a relatively short time.
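The map, reduce, and merge flow described in the abstract can be illustrated with a minimal in-process sketch. It is not the authors' implementation. The function names, the modulo partitioning rule, and the choice to drop edges that cross partitions are all assumptions made for illustration, and a simple label propagation loop stands in for the paper's information compression based clustering, which the abstract does not specify in detail.

```python
from collections import defaultdict


def map_phase(edges, num_partitions):
    """Map step: split the edge list into subsets.

    Assumption: a node's partition is node_id % num_partitions, and edges
    whose endpoints fall in different partitions are dropped. This is a
    simplification of the sketch, not something stated in the abstract.
    """
    partitions = defaultdict(list)
    for u, v in edges:
        pu, pv = u % num_partitions, v % num_partitions
        if pu == pv:
            partitions[pu].append((u, v))
    return partitions


def reduce_phase(edges, max_iters=20):
    """Reduce step: cluster one subset by loop iteration.

    Stand-in technique: label propagation, NOT the information
    compression objective used in the paper.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    labels = {node: node for node in neighbors}
    for _ in range(max_iters):
        changed = False
        for node in sorted(neighbors):
            counts = defaultdict(int)
            for nb in neighbors[node]:
                counts[labels[nb]] += 1
            # Most frequent neighbor label; ties go to the smallest label.
            best = min(counts, key=lambda lab: (-counts[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


def merge_phase(results):
    """Merge step: combine the per-partition labelings into communities."""
    communities = defaultdict(set)
    for part_id, labels in results.items():
        for node, label in labels.items():
            communities[(part_id, label)].add(node)
    return sorted(sorted(members) for members in communities.values())


def mine_communities(edges, num_partitions=2):
    """Run map, reduce, and merge sequentially in a single process."""
    partitions = map_phase(edges, num_partitions)
    results = {pid: reduce_phase(es) for pid, es in partitions.items()}
    return merge_phase(results)


if __name__ == "__main__":
    # Two triangles joined by one edge (4, 5) that crosses partitions.
    toy_edges = [(0, 2), (2, 4), (0, 4), (1, 3), (3, 5), (1, 5), (4, 5)]
    print(mine_communities(toy_edges, num_partitions=2))
```

On the toy graph, the two triangles land in separate partitions and each is returned as one community. In a real MapReduce deployment the reduce tasks would run in parallel on separate workers, and boundary edges would need explicit handling rather than being discarded.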
Keywords :
Big Data; data compression; data mining; parallel processing; pattern clustering; social networking (online); CInfoMR; Fast GN algorithm; MapReduce parallel framework; PDST algorithm; big data social network; community clustering; information compression; large scale network; loop iteration; network data splitting; optimal information coding problem; parallel computing method; social community structure mining method; social media; Channel coding; Communities; Data handling; Data storage systems; Information management; Social network services; MapReduce; information compression; random walk; social community; social network;
Conference_Title :
Computational Science and Engineering (CSE), 2013 IEEE 16th International Conference on
Conference_Location :
Sydney, NSW
DOI :
10.1109/CSE.2013.143