Title : 
Sequential extraction of clusters for imbalanced data
         
        
            Author : 
Hengjin Tang ; Miyamoto, Sadaaki
         
        
            Author_Institution : 
Grad. Sch. of Syst. & Inf. Eng., Univ. of Tsukuba, Tsukuba, Japan
         
        
            Abstract : 
K-means type clustering plays a central role among clustering algorithms. Despite its usefulness, it has a well-known drawback: the number of clusters must be determined beforehand, and the clustering results depend strongly on this number. Many researchers have studied how to estimate this number, and one approach is the sequential extraction of clusters. However, the clustering results of this algorithm are severely affected by the initial parameter setting. Additionally, if the dataset consists of clusters that are imbalanced in size and shape, the results can be even worse. To overcome these problems, we propose automatic estimation of parameter values during the clustering process. We show the effectiveness of the proposed algorithm using numerical examples.
         
        
            Keywords : 
data analysis; parameter estimation; pattern clustering; K-means type clustering; automatic parameter value estimation; clusters sequential extraction; imbalanced data; sequential clustering; Algorithm design and analysis; Clustering algorithms; Data mining; Educational institutions; Noise; Optimization; Shape; clustering; sequential extraction of clusters
         
        
            Conference_Title : 
Granular Computing (GrC), 2013 IEEE International Conference on
         
        
            Conference_Location : 
Beijing
         
        
            DOI : 
10.1109/GrC.2013.6740422