Title :
A new algorithm for clustering aggregation
Author_Institution :
Hunan Univ. of Commerce, Changsha, China
Abstract :
In this article, we propose the circle algorithm, a new clustering algorithm for large datasets. The idea of the algorithm is to find a set of vertices that are close to each other and far from all other vertices. The algorithm makes use of the connection between clustering aggregation and the problem of correlation clustering. Our work provides the best deterministic approximation algorithm for the variation of the correlation clustering problem we consider. We also show how sampling can be used to scale the algorithm to large datasets. We give an extensive empirical evaluation demonstrating the usefulness of the problem and of the solutions.
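The abstract does not spell out the circle algorithm itself, so the following is only a minimal sketch of the general setting it describes, not the authors' method. It assumes the usual clustering aggregation formulation, in which the distance between two vertices is the fraction of input clusterings that separate them, and it uses a hypothetical greedy grouping rule (the names `disagreement_distances`, `greedy_close_sets`, `aggregation_cost` and the `threshold` parameter are illustrative assumptions).

```python
from itertools import combinations


def disagreement_distances(clusterings):
    """Map each pair (u, v), u < v, to the fraction of input clusterings
    that place u and v in different clusters (assumed distance)."""
    n = len(clusterings[0])
    m = len(clusterings)
    dist = {}
    for u, v in combinations(range(n), 2):
        separated = sum(1 for labels in clusterings if labels[u] != labels[v])
        dist[(u, v)] = separated / m
    return dist


def greedy_close_sets(n, dist, threshold=0.5):
    """Hypothetical greedy rule, NOT the paper's circle algorithm:
    take the first unassigned vertex and group it with every
    unassigned vertex whose distance to it is below `threshold`."""
    unassigned = list(range(n))
    clusters = []
    while unassigned:
        center = unassigned[0]
        cluster = [center] + [
            v for v in unassigned[1:]
            if dist[(min(center, v), max(center, v))] < threshold
        ]
        clusters.append(cluster)
        unassigned = [v for v in unassigned if v not in cluster]
    return clusters


def aggregation_cost(clusters, dist):
    """Correlation clustering cost: a pair in the same cluster pays its
    distance, a pair in different clusters pays one minus its distance."""
    label = {v: i for i, members in enumerate(clusters) for v in members}
    cost = 0.0
    for (u, v), d in dist.items():
        cost += d if label[u] == label[v] else 1.0 - d
    return cost


# Three input clusterings of four objects, given as label lists.
inputs = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]]
distances = disagreement_distances(inputs)
result = greedy_close_sets(4, distances)
print(result, aggregation_cost(result, distances))
```

Under these assumptions, a lower cost means the aggregated clustering disagrees with the input clusterings on fewer pairs; any real implementation would replace the greedy rule with the algorithm described in the paper.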
Keywords :
approximation theory; data mining; pattern clustering; aggregation clustering; categorical data clustering; circle algorithm; correlation clustering problem; deterministic approximation algorithm; large dataset clustering algorithm; clustering aggregation; clustering categorical data
Conference_Title :
Computer Science and Automation Engineering (CSAE), 2011 IEEE International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-8727-1
DOI :
10.1109/CSAE.2011.5952875