Title : 
Parallel K-Medoids clustering algorithm based on Hadoop
         
        
            Author : 
Yaobin Jiang ; Jiongmin Zhang
         
        
            Author_Institution : 
Dept. of Comput. Sci. & Technol., East China Normal Univ., Shanghai, China
         
        
        
        
        
        
            Abstract : 
The K-Medoids clustering algorithm handles outlier samples better than the K-Means algorithm, but its time complexity prevents it from processing big data [1]. MapReduce is a parallel programming model for processing big data and is implemented in Hadoop. To overcome this limitation, a parallel K-Medoids algorithm based on Hadoop, HK-Medoids, is proposed. Each submitted job consists of iterated MapReduce rounds: in the map phase, each sample is assigned to the cluster whose center is most similar to it; in the combine phase, an intermediate center is calculated for each cluster; and in the reduce phase, the new center is calculated. The iteration stops when the new center is similar to the old one. Experimental results show that HK-Medoids produces good clustering results and achieves linear speedup on big data.
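The map, combine, and reduce phases described in the abstract can be illustrated with a small in-memory sketch. It is not the authors' Hadoop implementation: it assumes Euclidean distance as the similarity measure, treats each list in `splits` as one mapper's input, and assumes the reducer picks the new center from the combiners' candidates by minimizing a count-weighted distance sum, since the abstract does not specify how the centers are calculated. All function names (`map_phase`, `combine_phase`, `reduce_phase`, `hk_medoids_sketch`) and the `tol` and `max_iter` parameters are hypothetical.

```python
import math
from collections import defaultdict


def medoid_of(points, weights=None):
    """Return the point with the smallest weighted distance sum to all points."""
    weights = weights or [1] * len(points)
    return min(
        points,
        key=lambda c: sum(w * math.dist(c, p) for p, w in zip(points, weights)),
    )


def map_phase(split, medoids):
    """Map: assign each sample to the index of its nearest current center."""
    assigned = defaultdict(list)
    for x in split:
        k = min(range(len(medoids)), key=lambda i: math.dist(x, medoids[i]))
        assigned[k].append(x)
    return assigned


def combine_phase(assigned):
    """Combine: per split, emit an intermediate center and sample count per cluster."""
    return {k: (medoid_of(pts), len(pts)) for k, pts in assigned.items()}


def reduce_phase(candidates, old_medoids):
    """Reduce (assumed rule): pick the new center among the combiner candidates."""
    new_medoids = list(old_medoids)  # clusters that received no samples keep their center
    for k, cands in candidates.items():
        points = [c for c, _ in cands]
        counts = [n for _, n in cands]
        new_medoids[k] = medoid_of(points, counts)
    return new_medoids


def hk_medoids_sketch(splits, medoids, tol=1e-9, max_iter=100):
    """Repeat map/combine/reduce rounds until the centers stop moving."""
    for _ in range(max_iter):
        candidates = defaultdict(list)
        for split in splits:  # each split stands in for one mapper's input
            for k, cand in combine_phase(map_phase(split, medoids)).items():
                candidates[k].append(cand)
        new_medoids = reduce_phase(candidates, medoids)
        if all(math.dist(a, b) <= tol for a, b in zip(medoids, new_medoids)):
            return new_medoids
        medoids = new_medoids
    return medoids


if __name__ == "__main__":
    splits = [[(0, 0), (0, 1), (10, 10)], [(1, 0), (10, 11), (11, 10)]]
    print(hk_medoids_sketch(splits, [(0, 0), (10, 10)]))
```

Because every candidate is an actual sample, the returned centers are always members of the data set, which is the property that distinguishes medoids from the means used by K-Means. Choosing among per-split candidates only approximates the exact medoid of each cluster; the trade-off is that the reducer receives one candidate per split rather than every sample.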
         
        
            Keywords : 
Big Data; computational complexity; iterative methods; parallel programming; pattern clustering; HK-medoids; Hadoop; big data processing; iterative MapReduce procedures; k-means algorithm; map phase; outlier sample processing; parallel k-medoids clustering algorithm; parallel programming model; time complexity; Algorithm design and analysis; Clustering algorithms; Computational modeling; Educational institutions; Indexes; Partitioning algorithms; Programming; Big-Data; Clustering Analysis; Hadoop; K-Medoids; MapReduce;
         
        
        
        
            Conference_Title : 
Software Engineering and Service Science (ICSESS), 2014 5th IEEE International Conference on
         
        
            Conference_Location : 
Beijing
         
        
        
            Print_ISBN : 
978-1-4799-3278-8
         
        
        
            DOI : 
10.1109/ICSESS.2014.6933652