• Title of article

    On voting-based consensus of cluster ensembles

  • Author/Authors

    Ayad, Hanan G. and Kamel, Mohamed S.

  • Issue Information
    Journal with serial number, year 2010
  • Pages
    11
  • From page
    1943
  • To page
    1953
  • Abstract
    Voting-based consensus clustering refers to a distinct class of consensus methods in which the cluster label mismatch problem is explicitly addressed. The voting problem is defined as the problem of finding the optimal relabeling of a given partition with respect to a reference partition. It is commonly formulated as a weighted bipartite matching problem. In this paper, we present a more general formulation of the voting problem as a regression problem with multiple-response and multiple-input variables. We show that a recently introduced cumulative voting scheme is a special case corresponding to a linear regression method. We use a randomized ensemble generation technique, where an overproduced number of clusters is randomly selected for each ensemble partition. We apply an information theoretic algorithm for extracting the consensus clustering from the aggregated ensemble representation and for estimating the number of clusters. We apply it in conjunction with bipartite matching and cumulative voting. We present empirical evidence showing substantial improvements in clustering accuracy, stability, and estimation of the true number of clusters based on cumulative voting. The improvements are achieved in comparison to consensus algorithms based on bipartite matching, which perform very poorly with the chosen ensemble generation technique, and also to other recent consensus algorithms.
  • Keywords
    Cluster ensembles , Voting-based consensus , Clustering
  • Journal title
    PATTERN RECOGNITION
  • Serial Year
    2010
  • Record number

    1733497
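
The abstract defines the voting problem as finding the optimal relabeling of a partition with respect to a reference partition, commonly formulated as weighted bipartite matching. The following is a minimal illustrative sketch of that baseline formulation only, not the authors' implementation and not the cumulative voting scheme. The function name `relabel`, the toy label lists, and the use of exhaustive search over permutations (standing in for a proper assignment solver such as the Hungarian algorithm) are all assumptions made for illustration; the search is practical only for a small number of clusters.

```python
from itertools import permutations


def relabel(partition, reference, k):
    """Relabel `partition` so it agrees as much as possible with `reference`.

    Both arguments are equal-length lists of integer labels in range(k).
    Hypothetical helper for illustration; not taken from the paper.
    """
    # Contingency table: overlap[i][j] counts objects that carry label i
    # in `partition` and label j in `reference`.
    overlap = [[0] * k for _ in range(k)]
    for p, r in zip(partition, reference):
        overlap[p][r] += 1

    # Pick the one-to-one label mapping with the largest total overlap.
    # Exhaustive search is O(k!), so this is suitable for small k only.
    best = max(
        permutations(range(k)),
        key=lambda perm: sum(overlap[i][perm[i]] for i in range(k)),
    )
    return [best[p] for p in partition]


# Toy data: the same grouping of six objects under permuted labels.
reference = [0, 0, 1, 1, 2, 2]
partition = [2, 2, 0, 0, 1, 1]
relabeled = relabel(partition, reference, 3)
```

In this toy case the two partitions describe the same grouping, so the relabeled partition coincides with the reference. The one-to-one mapping requires both partitions to use the same number of labels, which is one reason a more general formulation is of interest when ensemble partitions have differing numbers of clusters.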