DocumentCode :
3703613
Title :
MapReduce-based k-prototypes clustering method for big data
Author :
Mohamed Aymen Ben Haj Kacem;Chiheb-Eddine Ben N'cir;Nadia Essoussi
Author_Institution :
LARODEC, Université de Tunis, Institut Supérieur de Gestion de Tunis, 41 Avenue de la Liberté, Cité Bouchoucha, 2000 Le Bardo, Tunisia
fYear :
2015
Firstpage :
1
Lastpage :
7
Abstract :
Big data clustering is a challenging task that arises in many application domains. Traditional clustering methods cannot handle large-scale data. Furthermore, big data often contain mixed types of attributes, both numerical and categorical. In this paper, we therefore propose a parallelization of the k-prototypes clustering method (MR-KP) using the MapReduce model to handle large-scale mixed data. Experimental results show that MR-KP scales well with increasing data set size and achieves close-to-linear speedup while maintaining clustering accuracy.
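To make the idea in the abstract concrete, the following is a minimal single-process sketch of one k-prototypes iteration split into a map step and a reduce step. It is not the authors' MR-KP implementation. It assumes the standard k-prototypes dissimilarity (squared Euclidean distance on numerical attributes plus a weight gamma times the number of categorical mismatches), and the record layout and the names `mixed_distance`, `map_assign`, `reduce_update`, and `one_iteration` are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict

# Assumed record layout: (numeric_tuple, categorical_tuple).
# A prototype uses the same layout.

def mixed_distance(record, prototype, gamma=1.0):
    """Squared Euclidean distance on numeric attributes plus
    gamma times the number of categorical mismatches."""
    (num, cat), (p_num, p_cat) = record, prototype
    numeric = sum((a - b) ** 2 for a, b in zip(num, p_num))
    mismatches = sum(a != b for a, b in zip(cat, p_cat))
    return numeric + gamma * mismatches

def map_assign(chunk, prototypes, gamma=1.0):
    """Map step: for each record in one data split, emit
    (index of nearest prototype, record)."""
    for record in chunk:
        nearest = min(
            range(len(prototypes)),
            key=lambda j: mixed_distance(record, prototypes[j], gamma),
        )
        yield nearest, record

def reduce_update(records):
    """Reduce step: the new prototype is the per-attribute mean of the
    numeric parts and the per-attribute mode of the categorical parts."""
    nums = [r[0] for r in records]
    cats = [r[1] for r in records]
    mean = tuple(sum(col) / len(col) for col in zip(*nums))
    mode = tuple(Counter(col).most_common(1)[0][0] for col in zip(*cats))
    return mean, mode

def one_iteration(chunks, prototypes, gamma=1.0):
    """Run one assignment/update pass; in a real MapReduce job each
    chunk would be processed by a separate mapper."""
    grouped = defaultdict(list)
    for chunk in chunks:
        for cluster, record in map_assign(chunk, prototypes, gamma):
            grouped[cluster].append(record)
    # An empty cluster keeps its previous prototype.
    return [
        reduce_update(grouped[j]) if grouped[j] else prototypes[j]
        for j in range(len(prototypes))
    ]
```

Repeating `one_iteration` until the prototypes stop changing gives the usual k-prototypes loop; the sketch only simulates the shuffle between the two steps with an in-memory dictionary.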
Keywords :
"Big data","Clustering methods","Data models","Clustering algorithms","Numerical models","Computational modeling","Prototypes"
Publisher :
IEEE
Conference_Titel :
2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA)
Print_ISBN :
978-1-4673-8272-4
Type :
conf
DOI :
10.1109/DSAA.2015.7344894
Filename :
7344894