DocumentCode :
2847478
Title :
Clustering Aggregation
Author :
Gionis, Aristides ; Mannila, Heikki ; Tsaparas, Panayiotis
Author_Institution :
Dept. of Comput. Sci., Helsinki Univ., Finland
fYear :
2005
fDate :
05-08 April 2005
Firstpage :
341
Lastpage :
352
Abstract :
We consider the following problem: given a set of clusterings, find a clustering that agrees as much as possible with the given clusterings. This problem, clustering aggregation, appears naturally in various contexts. For example, clustering categorical data is an instance of the problem: each categorical variable can be viewed as a clustering of the input rows. Moreover, clustering aggregation can be used as a meta-clustering method to improve the robustness of clusterings. The problem formulation does not require a priori information about the number of clusters, and it gives a natural way for handling missing values. We give a formal statement of the clustering-aggregation problem, we discuss related work, and we suggest a number of algorithms. For several of the methods we provide theoretical guarantees on the quality of the solutions. We also show how sampling can be used to scale the algorithms for large data sets. We give an extensive empirical evaluation demonstrating the usefulness of the problem and of the solutions.
Keywords :
data mining; meta data; optimisation; very large databases; clustering aggregation; clustering categorical data; large data sets; meta-clustering method; Clustering algorithms; Computer science; Data analysis; Information technology; Partitioning algorithms; Robustness; Sampling methods;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Engineering, 2005. ICDE 2005. Proceedings. 21st International Conference on
ISSN :
1084-4627
Print_ISBN :
0-7695-2285-8
Type :
conf
DOI :
10.1109/ICDE.2005.34
Filename :
1410139