Title :
Deterministic Feature Selection for K-Means Clustering
Author :
Boutsidis, Christos ; Magdon-Ismail, Malik
Author_Institution :
Christos Boutsidis Math. Sci. Dept., IBM T. J. Watson Res. Center, Ossining, NY, USA
Abstract :
We study feature selection for k-means clustering. Although the literature contains many methods with good empirical performance, algorithms with provable theoretical behavior have only recently been developed. Unfortunately, these algorithms are randomized and fail with, say, a constant probability. We present the first deterministic feature selection algorithm for k-means clustering with relative error guarantees. At the heart of our algorithm lies a deterministic method for decompositions of the identity and a structural result which quantifies some of the tradeoffs in dimensionality reduction.
Keywords :
computational complexity; deterministic algorithms; feature extraction; pattern clustering; probability; unsupervised learning; constant probability; deterministic feature selection algorithm; deterministic method; dimensionality reduction; identity decomposition; k-means clustering; relative error guarantees; unsupervised learning; Approximation algorithms; Approximation methods; Clustering algorithms; Feature extraction; Linear programming; Partitioning algorithms; Vectors; Clustering methods; dimensionality reduction; unsupervised learning;
Journal_Title :
Information Theory, IEEE Transactions on
DOI :
10.1109/TIT.2013.2255021