Title :
SOM Clustering Using Spark-MapReduce
Author :
Sarazin, Tugdual ; Azzag, Hanane ; Lebbah, Mustapha
Author_Institution :
ALTIC, Paris, France
Abstract :
In this paper, we consider designing clustering algorithms that can be used in MapReduce on the Spark platform, one of the most popular programming environments for processing large datasets. We focus on the practical and popular serial Self-Organizing Map (SOM) clustering algorithm. SOM is one of the best-known unsupervised learning algorithms, and it is useful for cluster analysis of large quantities of data. We have designed two scalable implementations of the SOM-MapReduce algorithm. We report experiments and demonstrate performance in terms of classification accuracy, Rand index, and speedup, using real and synthetic data with 100 million points on different numbers of cores.
Keywords :
pattern clustering; self-organising feature maps; unsupervised learning; MapReduce; SOM clustering; Spark platform; classification accuracy; self-organizing map clustering; unsupervised learning algorithm; Algorithm design and analysis; Clustering algorithms; Machine learning algorithms; Programming; Prototypes; Sparks; Vectors; Clustering; MapReduce; Self-Organizing Map; Spark;
Conference_Title :
Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International
Conference_Location :
Phoenix, AZ
Print_ISBN :
978-1-4799-4117-9
DOI :
10.1109/IPDPSW.2014.192