DocumentCode
3532953
Title
Approximation algorithm for correlation clustering
Author
Mitra, Pinaki ; Samal, Mamata
Author_Institution
Dept. of Comput. Sci. & Eng., Indian Inst. of Technol. Guwahati, Guwahati, India
fYear
2009
fDate
28-31 July 2009
Firstpage
140
Lastpage
145
Abstract
The correlation clustering problem has been introduced recently as a model for clustering data when a binary relationship between data points is known. Correlation clustering is a graph-theoretic based clustering. More precisely, in a graph G = (V,E) vertices of the graph denote the number of items, where each edge of the graph labeled as either + or - depending upon the similarity or dissimilarity of its end points. An optimal clustering is measured in terms of (i) Maximizing the number of similarity inside the cluster plus the number of dissimilarity between the clusters (MAXAGREE). Or (ii) Minimizing the dissimilarity inside the cluster plus the number of similarity between the clusters(MINDISAGREE). These two optimization problems are NP-complete for k ges 2. We study the problem of finding clustering in terms of maximizing the number of similarity inside the cluster plus the number of dissimilarity between the clusters (MAXAGREE). Giotis and Guruswami obtained a polynomial time factor 0.87856 approximation algorithm for MAXAGREE on general graphs, by using a random hyperplane rounding of SDP relaxation of MAXAGREE. In this paper we present a polynomial time approximation algorithm with the improved approximation ratio 3/25 + 5radic(5) = 0.884458 for MAXAGREE, by using the linear combination of vector rounding and the Gegenbauer polynomials rounding. This result is also applicable for weighted 2-cluster editing problem, since SDP formulation of weighted 2-cluster editing problem is same as MAXAGREE.
Keywords
computational complexity; graph theory; pattern clustering; polynomial approximation; Gegenbauer polynomials rounding; MAXAGREE; MINDISAGREE; NP-complete problem; correlation clustering problem; data clustering; graph-theoretic based clustering; optimization problem; polynomial time factor approximation algorithm; random hyperplane rounding; weighted 2-cluster editing problem; Approximation algorithms; Clustering algorithms; Computational biology; Computer science; Data engineering; Design optimization; Polynomials; Symmetric matrices; Time factors; Vectors;
fLanguage
English
Publisher
ieee
Conference_Titel
Networked Digital Technologies, 2009. NDT '09. First International Conference on
Conference_Location
Ostrava
Print_ISBN
978-1-4244-4614-8
Electronic_ISBN
978-1-4244-4615-5
Type
conf
DOI
10.1109/NDT.2009.5272169
Filename
5272169
Link To Document