• DocumentCode
    615247
  • Title

    Exploring of clustering algorithm on class-imbalanced data

  • Author

    Li Xuan ; Chen Zhigang ; Yang Fan

  • Author_Institution
    Dept. of Autom., Xiamen Univ., Xiamen, China
  • fYear
    2013
  • fDate
    26-28 April 2013
  • Firstpage
    89
  • Lastpage
    93
  • Abstract
    Imbalanced data distribution remains an unsolved problem in data mining and machine learning. This paper introduces the problem of class-imbalanced data in classification learning and extends it to clustering, since data clustering is an important and frequently used unsupervised learning method. Two verification methods, based on two different aspects of the original data, are proposed to test and verify the influence of class-imbalanced data on clustering. Furthermore, we conduct experiments with different imbalance ratios to explore their importance for clustering algorithms, since the imbalance ratio is a very important factor for performance in classification learning. Experimental results indicate that class imbalance in the dataset can seriously degrade the final performance and efficiency of the clustering algorithm, and that the higher the imbalance ratio, the stronger the adverse effect on clustering performance.
  • Keywords
    data handling; data mining; pattern classification; pattern clustering; unsupervised learning; class-imbalanced data clustering algorithm; classification learning; data mining; imbalanced data distribution; machine learning; unsupervised learning method; verification methods; Computers; Heart; Class-imbalanced Data; Clustering Algorithm; Imbalanced-ratios
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science & Education (ICCSE), 2013 8th International Conference on
  • Conference_Location
    Colombo
  • Print_ISBN
    978-1-4673-4464-7
  • Type

    conf

  • DOI
    10.1109/ICCSE.2013.6553890
  • Filename
    6553890