Title :
MapReduce Based Method for Big Data Semantic Clustering
Author :
Jie Yang ; Xiaoping Li
Author_Institution :
Sch. of Comput. Sci. & Eng., Southeast Univ., Nanjing, China
Abstract :
Big data analysis is an active research topic in cloud computing environments. Automatically mapping heterogeneous data that share the same semantics is one of its key problems. This paper proposes a big data clustering method based on the MapReduce framework. Big data are decomposed into many data chunks, which are clustered in parallel using an ant colony approach: ants move and cluster data elements according to the presented criterion. The proposed method is compared with a MapReduce-based k-means clustering algorithm on a large amount of practical data. Experimental results show that the proposed method is very effective for big data clustering.
Keywords :
cloud computing; data handling; optimisation; pattern clustering; MapReduce based method; ant colony; big data analysis; big data semantic clustering; cloud computing environments; data chunks; k-means clustering algorithm; parallel clustering; Accuracy; Algorithm design and analysis; Clustering algorithms; Data handling; Data storage systems; Information management; Semantics; Ant colony; MapReduce; big data; cloud computing; k-means;
Conference_Title :
Systems, Man, and Cybernetics (SMC), 2013 IEEE International Conference on
Conference_Location :
Manchester
DOI :
10.1109/SMC.2013.480