DocumentCode :
167654
Title :
SOM Clustering Using Spark-MapReduce
Author :
Sarazin, Tugdual ; Azzag, Hanane ; Lebbah, Mustapha
Author_Institution :
ALTIC, Paris, France
fYear :
2014
fDate :
19-23 May 2014
Firstpage :
1727
Lastpage :
1734
Abstract :
In this paper, we consider designing clustering algorithms that can be used in MapReduce on the Spark platform, one of the most popular programming environments for processing large datasets. We focus on the practical and popular serial Self-Organizing Map (SOM) clustering algorithm. SOM is one of the best-known unsupervised learning algorithms and is useful for cluster analysis of large quantities of data. We have designed two scalable implementations of the SOM-MapReduce algorithm. We report experiments and demonstrate performance in terms of classification accuracy, Rand index, and speedup, using real and synthetic data with 100 million points and different numbers of cores.
Keywords :
pattern clustering; self-organising feature maps; unsupervised learning; MapReduce; SOM clustering; Spark platform; classification accuracy; self-organizing map clustering; unsupervised learning algorithm; Algorithm design and analysis; Clustering algorithms; Machine learning algorithms; Programming; Prototypes; Sparks; Vectors; Clustering; MapReduce; Self-Organizing Map; Spark;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International
Conference_Location :
Phoenix, AZ
Print_ISBN :
978-1-4799-4117-9
Type :
conf
DOI :
10.1109/IPDPSW.2014.192
Filename :
6969583