DocumentCode :
594761
Title :
Clustering large datasets with kernel methods
Author :
Fausser, Stefan ; Schwenker, Friedhelm
Author_Institution :
Inst. of Neural Inf. Process., Univ. of Ulm, Ulm, Germany
fYear :
2012
fDate :
11-15 Nov. 2012
Firstpage :
501
Lastpage :
504
Abstract :
Real-life datasets are becoming larger and less linearly separable. Divisive clustering methods with a computation time linear in the number of samples n can handle large data but mostly assume linear boundaries between the clusters in input space. Kernel-based clustering methods are able to detect nonlinear boundaries in feature space but have a quadratic computation time O(n^2). In this paper, we propose a meta-algorithm that distributes small subsets of the large dataset, clusters these subsets in parallel, and merges the resulting approximate pseudo-centres, repeating this until the whole dataset has been processed. The meta-algorithm is able to use a wide range of kernel-based clustering methods. Here we integrate Kernel Fuzzy C-Means and Relational Neural Gas. We show analytically that the algorithm has a linear computation time O(n). In the experiments, we empirically evaluate the performance of the method on two real-life datasets.
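The chunk-and-merge idea in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes hard weighted kernel k-means in place of Kernel Fuzzy C-Means or Relational Neural Gas, a Gaussian kernel, sequential rather than parallel processing of the subsets, and a pseudo-centre approximated by the cluster member nearest to the cluster mean in feature space. All function names and parameters (rbf_kernel, weighted_kernel_kmeans, pseudo_centres, cluster_in_chunks, gamma, chunk_size) are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    # Gaussian kernel matrix between the rows of a and the rows of b.
    d2 = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def weighted_kernel_kmeans(K, w, k, iters=20, seed=0):
    # Hard kernel k-means on a precomputed kernel matrix K with sample weights w
    # (assumption: stands in for the kernel methods named in the abstract).
    rng = np.random.default_rng(seed)
    m = K.shape[0]
    labels = rng.integers(0, k, size=m)
    for _ in range(iters):
        dist = np.full((m, k), np.inf)
        for c in range(k):
            idx = np.flatnonzero(labels == c)
            if idx.size == 0:
                continue  # empty cluster: leave its distances at infinity
            wc = w[idx]
            s = wc.sum()
            # Squared feature-space distance to the weighted cluster mean.
            dist[:, c] = (np.diag(K)
                          - 2.0 * (K[:, idx] @ wc) / s
                          + wc @ K[np.ix_(idx, idx)] @ wc / s ** 2)
        new_labels = dist.argmin(1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

def pseudo_centres(data, K, w, labels, k):
    # Approximate each implicit centre by the member closest to the cluster mean
    # in feature space; its weight is the total weight of the cluster.
    protos, weights = [], []
    for c in range(k):
        idx = np.flatnonzero(labels == c)
        if idx.size == 0:
            continue
        wc = w[idx]
        s = wc.sum()
        # The mean's squared norm is constant within a cluster, so it is dropped.
        d = np.diag(K)[idx] - 2.0 * (K[np.ix_(idx, idx)] @ wc) / s
        protos.append(data[idx[d.argmin()]])
        weights.append(s)
    return np.array(protos), np.array(weights)

def cluster_in_chunks(X, k, chunk_size, gamma=1.0):
    # Cluster one small subset at a time and merge the carried-over
    # pseudo-centres into the next subset as weighted samples.
    protos = np.empty((0, X.shape[1]))
    proto_w = np.empty(0)
    for start in range(0, len(X), chunk_size):
        chunk = X[start:start + chunk_size]
        data = np.vstack([protos, chunk])
        w = np.concatenate([proto_w, np.ones(len(chunk))])
        K = rbf_kernel(data, data, gamma)
        labels = weighted_kernel_kmeans(K, w, k)
        protos, proto_w = pseudo_centres(data, K, w, labels, k)
    return protos, proto_w

# Usage on synthetic data: two Gaussian blobs, shuffled, 200 samples in total.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (100, 2)), rng.normal(5.0, 0.5, (100, 2))])
rng.shuffle(X)
protos, proto_w = cluster_in_chunks(X, k=2, chunk_size=50)
```

In this sketch each pass builds a kernel matrix over at most chunk_size + k samples, so for a fixed chunk size the total cost grows linearly with the number of samples n, whereas a single kernel matrix over the whole dataset would need n^2 entries.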
Keywords :
fuzzy set theory; neural nets; pattern clustering; divisive clustering methods; kernel based clustering methods; kernel fuzzy c-means; kernel methods; large datasets clustering; linear computation time; meta-algorithm; nonlinear boundaries; parallelized cluster; pseudo-centre; relational neural gas; Approximation algorithms; Clustering algorithms; Clustering methods; Equations; Kernel; Partitioning algorithms; Prototypes;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Pattern Recognition (ICPR), 2012 21st International Conference on
Conference_Location :
Tsukuba
ISSN :
1051-4651
Print_ISBN :
978-1-4673-2216-4
Type :
conf
Filename :
6460181