• DocumentCode
    110144
  • Title

    Deterministic Feature Selection for K-Means Clustering

  • Author

    Boutsidis, Christos ; Magdon-Ismail, Malik

  • Author_Institution
    Christos Boutsidis Math. Sci. Dept., IBM T. J. Watson Res. Center, Ossining, NY, USA
  • Volume
    59
  • Issue
    9
  • fYear
    2013
  • fDate
    Sept. 2013
  • Firstpage
    6099
  • Lastpage
    6110
  • Abstract
    We study feature selection for k-means clustering. Although the literature contains many methods with good empirical performance, algorithms with provable theoretical behavior have only recently been developed. Unfortunately, these algorithms are randomized and fail with, say, a constant probability. We present the first deterministic feature selection algorithm for k-means clustering with relative error guarantees. At the heart of our algorithm lies a deterministic method for decompositions of the identity and a structural result which quantifies some of the tradeoffs in dimensionality reduction.
  • Keywords
    computational complexity; deterministic algorithms; feature extraction; pattern clustering; probability; unsupervised learning; constant probability; deterministic feature selection algorithm; deterministic method; dimensionality reduction; identity decomposition; k-means clustering; relative error guarantees; unsupervised learning; Approximation algorithms; Approximation methods; Clustering algorithms; Feature extraction; Linear programming; Partitioning algorithms; Vectors; Clustering methods; dimensionality reduction; unsupervised learning;
  • fLanguage
    English
  • Journal_Title
    Information Theory, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    0018-9448
  • Type

    jour

  • DOI
    10.1109/TIT.2013.2255021
  • Filename
    6488848