• DocumentCode
    2448956
  • Title
    A large scale clustering scheme for kernel K-Means
  • Author
    Zhang, Rong ; Rudnicky, Alexander I.
  • Author_Institution
    Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA
  • Volume
    4
  • fYear
    2002
  • fDate
    2002
  • Firstpage
    289
  • Abstract
    Kernel functions can be viewed as a non-linear transformation that increases the separability of the input data by mapping them to a new high dimensional space. The incorporation of kernel functions enables the K-Means algorithm to explore the inherent data pattern in the new space. However, previous applications of the kernel K-Means algorithm have been confined to small corpora because of its high computation and storage costs. To overcome these obstacles, we propose a new clustering scheme which changes the clustering order from the sequence of samples to the sequence of kernels, and employs a disk-based strategy to control data. Our experiments on handwritten digits recognition demonstrate that the new clustering scheme is very efficient for a large corpus, saving more than 90% of the running time.
  • Keywords
    handwritten character recognition; learning (artificial intelligence); matrix algebra; pattern classification; pattern clustering; data pattern; disk-based strategy; handwritten digits recognition; high dimensional space; kernel K-Means; large scale clustering scheme; nonlinear transformation; separability; Clustering algorithms; Computational efficiency; Computer science; Euclidean distance; Kernel; Large-scale systems; Learning systems; Machine learning; Partitioning algorithms; Unsupervised learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition, 2002. Proceedings. 16th International Conference on
  • ISSN
    1051-4651
  • Print_ISBN
    0-7695-1695-X
  • Type
    conf
  • DOI
    10.1109/ICPR.2002.1047453
  • Filename
    1047453