  • DocumentCode
    457248
  • Title
    GCA: A real-time grid-based clustering algorithm for large data set
  • Author
    Yu, Zhiwen ; Wong, Hau-San
  • Author_Institution
    Dept. of Comput. Sci., City Univ. of Hong Kong
  • Volume
    2
  • fYear
    0
  • fDate
    0-0 0
  • Firstpage
    740
  • Lastpage
    743
  • Abstract
    Few existing unsupervised learning (clustering) algorithms consider clustering the data points in a low-dimensional subspace in real time. In this paper, we present a grid-based clustering algorithm (GCA) with O(n) time complexity. Unlike previous clustering algorithms, GCA pays more attention to the running time of the algorithm. GCA achieves low running time by (i) determining the number of clusters according to the point density of the grid cells and (ii) computing the distances between the cluster centers and the grid cells rather than the data points. To make GCA more efficient, principal component analysis (PCA) is introduced to transform the data points from a high-dimensional space to a low-dimensional one. Finally, we analyze the performance of GCA and show that it outperforms most current state-of-the-art methods in terms of efficiency. In particular, it outperforms the k-means algorithm by several orders of magnitude in running time.
  • Keywords
    computational complexity; pattern clustering; principal component analysis; unsupervised learning; grid cell; large data set; real-time grid-based clustering algorithm; time complexity; Clustering algorithms; Computer science; Data mining; Databases; Grid computing; Kernel; Machine intelligence; Pattern recognition; Performance analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition, 2006. ICPR 2006. 18th International Conference on
  • Conference_Location
    Hong Kong
  • ISSN
    1051-4651
  • Print_ISBN
    0-7695-2521-0
  • Type
    conf
  • DOI
    10.1109/ICPR.2006.597
  • Filename
    1699311
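
The abstract names three ingredients: PCA to reduce dimension, grid-cell point density to decide the number of clusters, and distances measured between cluster centers and grid cells rather than data points. The following is a minimal sketch of how these could fit together, not the authors' implementation. The function names (`pca_project`, `grid_cluster`), the parameters (`bins`, `min_count`, `min_sep`), and the greedy rule for picking dense cells as centers are all assumptions made for illustration. Note also that `np.unique` sorts the cells, so this sketch is not strictly O(n); a hash map from cell index to count would avoid the sort.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project X onto its top principal components (computed via SVD)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def grid_cluster(X, n_components=2, bins=16, min_count=5, min_sep=3):
    """Illustrative grid-based clustering (assumed rules, see lead-in).

    Returns (labels, n_clusters), where labels[i] is the cluster of X[i].
    """
    Z = pca_project(X, n_components)

    # Square cells: one cell width for every axis, set by the widest axis.
    lo = Z.min(axis=0)
    width = (Z.max(axis=0) - lo).max() / bins
    if width == 0:
        width = 1.0  # all points identical; any width works
    idx = np.minimum(((Z - lo) / width).astype(int), bins - 1)

    # Occupied cells, the cell of each point, and the per-cell point count.
    cells, inverse, counts = np.unique(
        idx, axis=0, return_inverse=True, return_counts=True
    )
    inverse = inverse.reshape(-1)  # 1-D regardless of NumPy version

    # Assumed rule: visit cells from densest to sparsest; a cell becomes a
    # center if it is dense enough and more than min_sep cells (Chebyshev
    # distance) away from every center chosen so far.
    order = np.argsort(-counts, kind="stable")
    centers = []
    for c in order:
        if counts[c] < min_count:
            break
        if all(np.abs(cells[c] - cells[k]).max() > min_sep for k in centers):
            centers.append(c)
    if not centers:
        centers = [order[0]]  # fall back to the single densest cell

    # Distances are computed between centers and occupied cells only,
    # never between centers and individual data points.
    center_coords = cells[centers].astype(float)
    diff = cells[:, None, :].astype(float) - center_coords[None, :, :]
    cell_labels = np.linalg.norm(diff, axis=2).argmin(axis=1)

    # Each point inherits the label of the cell it falls in.
    return cell_labels[inverse], len(centers)


# Usage: two well-separated synthetic blobs in 5 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(200, 5)),
    rng.normal(5.0, 0.3, size=(200, 5)),
])
labels, n_clusters = grid_cluster(X)
```

Because the distance step runs over occupied cells rather than points, its cost depends on the grid resolution and the number of centers, not on the size of the data set; only the binning pass touches every point.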