• DocumentCode
    3532953
  • Title

    Approximation algorithm for correlation clustering

  • Author

    Mitra, Pinaki ; Samal, Mamata

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Indian Inst. of Technol. Guwahati, Guwahati, India
  • fYear
    2009
  • fDate
    28-31 July 2009
  • Firstpage
    140
  • Lastpage
    145
  • Abstract
    The correlation clustering problem has been introduced recently as a model for clustering data when a binary relationship between data points is known. Correlation clustering is a graph-theoretic based clustering. More precisely, in a graph G = (V,E) vertices of the graph denote the number of items, where each edge of the graph labeled as either + or - depending upon the similarity or dissimilarity of its end points. An optimal clustering is measured in terms of (i) Maximizing the number of similarity inside the cluster plus the number of dissimilarity between the clusters (MAXAGREE). Or (ii) Minimizing the dissimilarity inside the cluster plus the number of similarity between the clusters(MINDISAGREE). These two optimization problems are NP-complete for k ges 2. We study the problem of finding clustering in terms of maximizing the number of similarity inside the cluster plus the number of dissimilarity between the clusters (MAXAGREE). Giotis and Guruswami obtained a polynomial time factor 0.87856 approximation algorithm for MAXAGREE on general graphs, by using a random hyperplane rounding of SDP relaxation of MAXAGREE. In this paper we present a polynomial time approximation algorithm with the improved approximation ratio 3/25 + 5radic(5) = 0.884458 for MAXAGREE, by using the linear combination of vector rounding and the Gegenbauer polynomials rounding. This result is also applicable for weighted 2-cluster editing problem, since SDP formulation of weighted 2-cluster editing problem is same as MAXAGREE.
  • Keywords
    computational complexity; graph theory; pattern clustering; polynomial approximation; Gegenbauer polynomials rounding; MAXAGREE; MINDISAGREE; NP-complete problem; correlation clustering problem; data clustering; graph-theoretic based clustering; optimization problem; polynomial time factor approximation algorithm; random hyperplane rounding; weighted 2-cluster editing problem; Approximation algorithms; Clustering algorithms; Computational biology; Computer science; Data engineering; Design optimization; Polynomials; Symmetric matrices; Time factors; Vectors;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Networked Digital Technologies, 2009. NDT '09. First International Conference on
  • Conference_Location
    Ostrava
  • Print_ISBN
    978-1-4244-4614-8
  • Electronic_ISBN
    978-1-4244-4615-5
  • Type

    conf

  • DOI
    10.1109/NDT.2009.5272169
  • Filename
    5272169