DocumentCode
680736
Title
Accelerating One-Pass Clustering by Cluster Selection Racing
Author
Labroche, Nicolas ; Detyniecki, Marcin ; Baerecke, Thomas
Author_Institution
LIP6, UPMC Paris 6, Paris, France
fYear
2013
fDate
4-6 Nov. 2013
Firstpage
491
Lastpage
498
Abstract
This paper introduces a racing mechanism in the cluster selection process for one-pass clustering algorithms. We focus on cases where data are not numerical vectors and where it is not necessarily possible to compute a mean for each cluster. In this case, the distance of each point to existing clusters can be computed exhaustively with a quadratic complexity which is not tractable in most of nowadays use cases. In this paper we first introduce a stochastic approach for estimating the distance of each new data point to existing clusters based on Hoeffding and Bernstein bounds, that reduces the number of computations by simultaneously selecting the quantity of data to be sampled and by eliminating the non-competitive clusters. Second, this paper shows that it is possible to improve the efficiency of our approach by reducing the theoretical values of the Hoeffding and Bernstein bounds. Our algorithms, tested on real data sets, provide significant acceleration of the one-pass clustering algorithms, while making less error (or any depending on parameters) than one-pass clustering algorithm with fixed number of comparisons with each cluster.
Keywords
computational complexity; data analysis; pattern clustering; stochastic processes; unsupervised learning; Hoeffding-Bernstein bounds; cluster selection process; cluster selection racing; data point; noncompetitive clusters; one-pass clustering acceleration; one-pass clustering algorithm; quadratic complexity; racing mechanism; stochastic approach; unsupervised learning; Acceleration; Clustering algorithms; Complexity theory; Error probability; Estimation; Partitioning algorithms; Random variables; cluster selection; one-pass clustering; racing;
fLanguage
English
Publisher
ieee
Conference_Titel
Tools with Artificial Intelligence (ICTAI), 2013 IEEE 25th International Conference on
Conference_Location
Herndon, VA
ISSN
1082-3409
Print_ISBN
978-1-4799-2971-9
Type
conf
DOI
10.1109/ICTAI.2013.79
Filename
6735290
Link To Document