Title :
Accelerating One-Pass Clustering by Cluster Selection Racing
Author :
Labroche, Nicolas ; Detyniecki, Marcin ; Baerecke, Thomas
Author_Institution :
LIP6, UPMC Paris 6, Paris, France
Abstract :
This paper introduces a racing mechanism in the cluster selection process for one-pass clustering algorithms. We focus on cases where data are not numerical vectors and where it is not necessarily possible to compute a mean for each cluster. In this case, the distance of each point to existing clusters can be computed exhaustively with a quadratic complexity which is not tractable in most of nowadays use cases. In this paper we first introduce a stochastic approach for estimating the distance of each new data point to existing clusters based on Hoeffding and Bernstein bounds, that reduces the number of computations by simultaneously selecting the quantity of data to be sampled and by eliminating the non-competitive clusters. Second, this paper shows that it is possible to improve the efficiency of our approach by reducing the theoretical values of the Hoeffding and Bernstein bounds. Our algorithms, tested on real data sets, provide significant acceleration of the one-pass clustering algorithms, while making less error (or any depending on parameters) than one-pass clustering algorithm with fixed number of comparisons with each cluster.
Keywords :
computational complexity; data analysis; pattern clustering; stochastic processes; unsupervised learning; Hoeffding-Bernstein bounds; cluster selection process; cluster selection racing; data point; noncompetitive clusters; one-pass clustering acceleration; one-pass clustering algorithm; quadratic complexity; racing mechanism; stochastic approach; unsupervised learning; Acceleration; Clustering algorithms; Complexity theory; Error probability; Estimation; Partitioning algorithms; Random variables; cluster selection; one-pass clustering; racing;
Conference_Titel :
Tools with Artificial Intelligence (ICTAI), 2013 IEEE 25th International Conference on
Conference_Location :
Herndon, VA
Print_ISBN :
978-1-4799-2971-9
DOI :
10.1109/ICTAI.2013.79