• DocumentCode
    694231
  • Title
    Scalable clustering with adaptive instance sampling
  • Author
    JaeKyung Yang ; ByoungJin Yu ; MyoungJin Choi
  • Author_Institution
    Dept. of Ind. & Inf. Syst. Eng., Chonbuk Nat. Univ., Jeonju, South Korea
  • fYear
    2013
  • fDate
    10-13 Dec. 2013
  • Firstpage
    1309
  • Lastpage
    1313
  • Abstract
    The computation time of most clustering algorithms grows with the number of attributes and instances, so the data mining community has worked to make cluster induction efficient. Scalability is therefore a critical issue for the community. One way to handle it is to cluster on a subset of the instances. This paper proposes an algorithm that performs clustering efficiently by using the nested partitions method to address the noisy performance problem that arises when only a subset of instances is used, and by adjusting the sample rate appropriately at each iteration. Compared with NPCLUSTER, the resulting Adaptive NPCLUSTER algorithm achieved better similarity on small datasets and worse similarity on large datasets, but it required less computation time.
  • Keywords
    data mining; iterative methods; pattern clustering; sampling methods; adaptive NPCLUSTER algorithm; adaptive instance sampling; data mining community; iteration method; nested partitions method; noisy performance problems; scalable clustering algorithm; Algorithm design and analysis; Clustering algorithms; Data mining; Databases; Noise; Partitioning algorithms; Scalability; Adaptive Sampling; Clustering; Data Mining; Metaheuristics; Nested Partition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Industrial Engineering and Engineering Management (IEEM), 2013 IEEE International Conference on
  • Conference_Location
    Bangkok
  • Type
    conf
  • DOI
    10.1109/IEEM.2013.6962622
  • Filename
    6962622