• DocumentCode
    10248
  • Title

    Similarity Measures for Comparing Biclusterings

  • Author

    Horta, Danilo ; Campello, Ricardo J. G. B.

  • Author_Institution
    Inst. de Cienc. Mat. e de Comput. - ICMC, Univ. de Sao Paulo - Campus de Sao Carlos, Sao Carlos, Brazil
  • Volume
    11
  • Issue
    5
  • fYear
    2014
  • fDate
    Sept.-Oct. 2014
  • Firstpage
    942
  • Lastpage
    954
  • Abstract
    The comparison of ordinary partitions of a set of objects is well established in the clustering literature, which includes several studies analyzing the properties of similarity measures for comparing partitions. However, similarity measures for clusterings are not readily applicable to biclusterings, since each bicluster is a tuple of two sets (of rows and columns), whereas a cluster is only a single set (of rows). Some biclustering similarity measures have been defined as minor contributions in papers that primarily propose and evaluate biclustering algorithms or present comparative analyses of them. As a consequence, some desirable properties of such measures have been overlooked in the literature. We review 14 biclustering similarity measures. We define eight desirable properties of a biclustering measure, discuss their importance, and prove which properties each of the reviewed measures has. We show examples drawn from and inspired by important studies in which several biclustering measures convey misleading evaluations due to the absence of one or more of the discussed properties. We also advocate the use of a more general comparison approach that is based on the idea of transforming the original problem of comparing biclusterings into an equivalent problem of comparing clustering partitions with overlapping clusters.
  • Keywords
    biology computing; genetics; pattern clustering; biclustering algorithms; clustering literature; clustering partitions; comparative analysis; gene expression; overlapping clusters; Algorithm design and analysis; Bioinformatics; Clustering algorithms; Computational biology; Biclustering similarity measure; external evaluation; validity index
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2014.2325016
  • Filename
    6817611