• DocumentCode
    1479840
  • Title
    Consensus Clustering Based on a New Probabilistic Rand Index with Application to Subtopic Retrieval
  • Author
    Carpineto, Claudio ; Romano, Giovanni
  • Author_Institution
    Fondazione Ugo Bordoni, Rome, Italy
  • Volume
    34
  • Issue
    12
  • fYear
    2012
  • Firstpage
    2315
  • Lastpage
    2326
  • Abstract
    We introduce a probabilistic version of the well-known Rand Index (RI) for measuring the similarity between two partitions, called Probabilistic Rand Index (PRI), in which agreements and disagreements at the object-pair level are weighted according to the probability of their occurring by chance. We then cast consensus clustering as an optimization problem of the PRI value between a target partition and a set of given partitions, experimenting with a simple and very efficient stochastic optimization algorithm. Remarkable performance gains over input partitions as well as over existing related methods are demonstrated through a range of applications, including a new use of consensus clustering to improve subtopic retrieval.
  • Keywords
    information retrieval; optimisation; pattern clustering; probability; stochastic processes; PRI value; consensus clustering; object-pair level agreement; object-pair level disagreement; occurrence probability; optimization problem; performance gain; probabilistic Rand index; similarity measurement; stochastic optimization algorithm; subtopic retrieval; Clustering algorithms; Indexes; Information retrieval; Optimized production technology; Partitioning algorithms; Probabilistic logic; Search problems; Consensus clustering; Rand index; probabilistic Rand index; search results clustering; subtopic retrieval
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type
    jour
  • DOI
    10.1109/TPAMI.2012.80
  • Filename
    6175906
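
The abstract builds on the classical Rand Index (RI), which scores two partitions by the fraction of object pairs they treat the same way (both grouping the pair together, or both separating it). As a rough illustration only, the unweighted RI could be computed as in the sketch below. This is not the paper's Probabilistic Rand Index: the chance-based pair weights are not given in this record and are not reproduced here. The function name rand_index and the label-list input format are assumptions made for the example.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Classical (unweighted) Rand Index between two partitions.

    Each partition is a sequence of cluster labels, one per object,
    with both sequences listing the objects in the same order.
    This is the standard RI, not the PRI described in the abstract.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    n = len(labels_a)
    if n < 2:
        raise ValueError("at least two objects are needed to form a pair")

    agreements = 0
    total_pairs = 0
    for i, j in combinations(range(n), 2):
        same_in_a = labels_a[i] == labels_a[j]
        same_in_b = labels_b[i] == labels_b[j]
        # A pair is an agreement when both partitions put it together
        # or both partitions keep it apart.
        if same_in_a == same_in_b:
            agreements += 1
        total_pairs += 1
    return agreements / total_pairs


if __name__ == "__main__":
    # Same grouping under different label names: every pair agrees.
    print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
    # Two groupings that cut across each other: few pairs agree.
    print(rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

Because the score depends only on pairwise co-membership, renaming the cluster labels does not change it. A consensus method in the spirit of the abstract would search for a target partition that scores well against every input partition, but that search procedure is not sketched here.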