• DocumentCode
    3395147
  • Title
    A hybrid algorithm for k-medoid clustering of large data sets
  • Author
    Sheng, Weiguo; Liu, Xiaohui
  • Author_Institution
    Dept. of Inf. Syst. & Comput., Brunel Univ., London, UK
  • Volume
    1
  • fYear
    2004
  • fDate
    19-23 June 2004
  • Firstpage
    77
  • Abstract
    In this paper, we propose a novel local search heuristic and hybridize it with a genetic algorithm for k-medoid clustering of large data sets, an NP-hard optimization problem. The local search heuristic selects k medoids from the data set and tries to efficiently minimize the total dissimilarity within each cluster. To avoid becoming trapped in local optima, the heuristic is hybridized with a genetic algorithm, yielding the Hybrid K-medoid Algorithm (HKA). Our experiments show that, compared with the previous genetic algorithm based k-medoid clustering approaches GCA and RARwGA, HKA provides better clustering solutions and does so more efficiently. The experiments use two gene expression data sets, which may involve large noise components.
  • Keywords
    biology computing; computational complexity; data structures; genetic algorithms; pattern clustering; search problems; very large databases; GCA; NP-hard optimization problem; RARwGA; data clustering; gene expression data sets; genetic algorithm; hybrid K-medoid algorithm; hybrid algorithm; k-medoid clustering; large data sets; local optimality; local search heuristic; total dissimilarity minimization; Algorithm design and analysis; Clustering algorithms; Data analysis; Gene expression; Genetic algorithms; Information systems; Noise measurement; Noise robustness; Partitioning algorithms; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Evolutionary Computation, 2004. CEC2004. Congress on
  • Print_ISBN
    0-7803-8515-2
  • Type
    conf
  • DOI
    10.1109/CEC.2004.1330840
  • Filename
    1330840
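
The abstract describes choosing k medoids so that the total dissimilarity within the clusters is minimized. The sketch below illustrates that objective with a plain greedy swap search. It is not the HKA local search heuristic or its genetic component, neither of which this record specifies; the function names, the Euclidean dissimilarity, and the sample points are all assumptions made for illustration.

```python
def euclidean(a, b):
    """Assumed dissimilarity measure: Euclidean distance between two tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def total_dissimilarity(points, medoids, dist):
    """k-medoid objective: each point contributes its distance to the nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def swap_local_search(points, medoids, dist):
    """Generic greedy swap search (hypothetical helper, not the paper's heuristic).

    Repeatedly replaces one medoid index with a non-medoid index whenever
    the swap strictly lowers the total dissimilarity, and stops when no
    single swap helps. The result is a local optimum only.
    """
    medoids = list(medoids)
    best = total_dissimilarity(points, medoids, dist)
    improved = True
    while improved:
        improved = False
        for i in range(len(medoids)):
            for c in range(len(points)):
                if c in medoids:
                    continue
                trial = medoids[:i] + [c] + medoids[i + 1:]
                cost = total_dissimilarity(points, trial, dist)
                if cost < best:
                    medoids, best, improved = trial, cost, True
    return sorted(medoids), best


# Two well-separated groups of three points; start from a poor choice of medoids.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, cost = swap_local_search(points, [1, 2], euclidean)
print(medoids, cost)
```

Because a search of this kind stops at the first configuration that no single swap improves, its result depends on the starting medoids. That is the local optimality issue the abstract says the genetic algorithm is meant to address.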