• DocumentCode
    2398996
  • Title

    Achieving Natural Clustering by Validating Results of Iterative Evolutionary Clustering Approach

  • Author

    Özyer, Tansel ; Alhajj, Reda

  • Author_Institution
    Dept. of Comput. Sci., Calgary Univ., Alta.
  • fYear
    2006
  • fDate
    Sept. 2006
  • Firstpage
    488
  • Lastpage
    493
  • Abstract
    Clustering is an essential process that leads to the classification of a given set of instances based on user-specified criteria, and different factors may lead to different clustering results. Thus, a large number of clustering algorithms exist to satisfy different purposes. However, two challenges motivate the development of new algorithms: scalability, and the fact that algorithms in general need the number of clusters to be specified a priori, which is mostly hard to estimate even for domain experts. This paper presents a novel approach to handle these two issues. We developed a clustering method that works iteratively to handle the scalability problem, and we utilize a multi-objective genetic algorithm combined with validity indexes to decide on the number of clusters. The basic idea is to partition the dataset first, then cluster each partition separately. Finally, each obtained cluster is treated as a single instance (represented by its centroid) and a conquer process is performed to get the final clustering of the complete dataset. Test results on one large real dataset demonstrate the applicability and effectiveness of the proposed approach.
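The partition, cluster, and conquer steps named in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: plain k-means stands in for the multi-objective genetic algorithm, the cluster counts `k_local` and `k_final` are fixed by the caller rather than chosen through validity indexes, and every function and parameter name here is hypothetical.

```python
import random
from math import dist


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on tuples; an assumed stand-in for the paper's GA clusterer."""
    rng = random.Random(seed)
    k = min(k, len(points))
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            groups[nearest].append(p)
        # Recompute each centroid as the mean of its group;
        # an empty group keeps its previous centroid.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def divide_and_conquer(points, n_partitions, k_local, k_final, seed=0):
    """Partition the data, cluster each part, then cluster the centroids."""
    rng = random.Random(seed)
    shuffled = points[:]
    rng.shuffle(shuffled)
    # Divide: split the shuffled data into n_partitions interleaved slices.
    partitions = [shuffled[i::n_partitions] for i in range(n_partitions)]

    # Cluster each partition separately and keep only the centroids,
    # so each local cluster is represented by a single instance.
    local_centroids = []
    for part in partitions:
        local_centroids.extend(kmeans(part, k_local, seed=seed))

    # Conquer: cluster the centroids to get the final clustering.
    final_centroids = kmeans(local_centroids, k_final, seed=seed)

    # Label every original point by its nearest final centroid.
    labels = [
        min(range(len(final_centroids)), key=lambda i: dist(p, final_centroids[i]))
        for p in points
    ]
    return final_centroids, labels
```

The conquer step only ever sees the local centroids, so its cost depends on the number of partitions and local clusters rather than on the size of the full dataset, which is the scalability argument the abstract makes.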
  • Keywords
    data analysis; genetic algorithms; iterative methods; pattern classification; pattern clustering; clustering algorithms; data mining; domain experts; iterative evolutionary clustering approach; multiobjective genetic algorithm; natural clustering; user-specified criteria; validity indexes; Computer science; Concurrent computing; Couplings; Intelligent systems; Scalability; Testing; Upper bound; classification; clustering; multi-objective genetic algorithm; partitioning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Systems, 2006 3rd International IEEE Conference on
  • Conference_Location
    London
  • Print_ISBN
    1-4244-01996-8
  • Electronic_ISBN
    1-4244-01996-8
  • Type

    conf

  • DOI
    10.1109/IS.2006.348468
  • Filename
    4155475