• DocumentCode
    384287
  • Title
    Fast hierarchical clustering based on compressed data
  • Author
    Rendon, Erendira ; Barandela, Ricardo
  • Author_Institution
    Pattern Recognition Lab., Technol. Inst. of Toluca, Metepec, Mexico
  • Volume
    2
  • fYear
    2002
  • fDate
    2002
  • Firstpage
    216
  • Abstract
    Clustering in data mining is the process of discovering groups in a dataset such that the similarity between elements of the same cluster is maximal and the similarity between elements of different clusters is minimal. Some algorithms cluster a representative sample of the whole dataset and then perform a labeling process to assign the rest of the original database to those clusters. Other algorithms perform a pre-clustering phase and then apply a classic clustering algorithm to create the final clusters. We present a pre-clustering algorithm that not only provides good results and uses main memory efficiently but is also independent of the data input order. The efficiency of the proposed algorithm and a comparison with the BIRCH pre-clustering algorithm are shown.
  • Keywords
    data compression; data mining; pattern clustering; compressed data; fast hierarchical clustering; labeling process; Algorithm design and analysis; Clustering algorithms; Data analysis; Data mining; Ear; Iterative algorithms; Labeling; Pattern analysis; Pattern recognition; Spatial databases
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition, 2002. Proceedings. 16th International Conference on
  • ISSN
    1051-4651
  • Print_ISBN
    0-7695-1695-X
  • Type
    conf
  • DOI
    10.1109/ICPR.2002.1048276
  • Filename
    1048276