DocumentCode :
3701990
Title :
Real time clustering of tweets using adaptive PSO technique and MapReduce
Author :
Akhilesh P. Chunne;Uddagiri Chandrasekhar;Chetan Malhotra
Author_Institution :
School of Information Technology and Engineering (SITE), VIT University, Vellore, TN, India
fYear :
2015
fDate :
4/1/2015 12:00:00 AM
Firstpage :
452
Lastpage :
457
Abstract :
Social media platforms such as Twitter, Facebook and YouTube now generate large amounts of data. Such data have complicated structures, which makes capturing, storing, analyzing, clustering and visualizing them difficult. Recently, clustering of such data has caught the attention of researchers, and algorithms such as K-Means have been suggested for the task. For data streams, an algorithm is needed that can cluster the data in less time, which motivates the use of a parallel and distributed environment based on the map-reduce framework. Likewise, particle swarm optimization (PSO) techniques are preferable for the clustering problem, since they scale very well as the data volume and the number of dimensions increase. The paper implements a PSO algorithm for clustering Twitter data using Hadoop's map-reduce framework. The outcome illustrates that parallel PSO performs very well compared to the K-Means algorithm. The results show that the F-Measure increases as the number of particles increases. The optimum number of nodes required is also illustrated with experimental results.
Keywords :
"Clustering algorithms","Twitter","Algorithm design and analysis","Data mining","Mathematical model","Distributed databases","Real-time systems"
Publisher :
IEEE
Conference_Titel :
Communication Technologies (GCCT), 2015 Global Conference on
Type :
conf
DOI :
10.1109/GCCT.2015.7342704
Filename :
7342704