DocumentCode :
262288
Title :
A Paralleled Big Data Algorithm with MapReduce Framework for Mining Twitter Data
Author :
Li Bing ; Chan, Keith C. C.
Author_Institution :
Dept. of Comput., Hong Kong Polytech. Univ., Kowloon, China
fYear :
2014
fDate :
3-5 Dec. 2014
Firstpage :
121
Lastpage :
128
Abstract :
Some recent studies have suggested that public opinions expressed in social media may be correlated with various social issues. To find out what can actually be discovered in social media data, data mining is needed. Data mining approaches that can handle massive amounts of data have recently been referred to as big data algorithms. In this paper, we propose a big data algorithm for mining Twitter data. Furthermore, to ensure scalability, the MapReduce framework is adopted to parallelize the proposed algorithm. Experiments demonstrate the potential of the proposed algorithm. Computationally, the speed of execution is shown to increase significantly despite increases in data set size. In fact, the acceleration ratio increases as the size of the data set increases and as the number of DataNodes increases.
Keywords :
Big Data; data mining; parallel algorithms; social networking (online); DataNodes; MapReduce framework; Twitter data mining; acceleration ratio; big handling; data set size; paralleled big data algorithm; public opinions; social media data; Accuracy; Big data; Data mining; Media; Pragmatics; Twitter; Vectors; MapReduce; Twitter; big data algorithm; data mining; social media;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Big Data and Cloud Computing (BdCloud), 2014 IEEE Fourth International Conference on
Conference_Location :
Sydney, NSW
Type :
conf
DOI :
10.1109/BDCloud.2014.26
Filename :
7034776