DocumentCode
1479840
Title
Consensus Clustering Based on a New Probabilistic Rand Index with Application to Subtopic Retrieval
Author
Carpineto, Claudio ; Romano, Giovanni
Author_Institution
Fondazione Ugo Bordoni, Rome, Italy
Volume
34
Issue
12
fYear
2012
Firstpage
2315
Lastpage
2326
Abstract
We introduce a probabilistic version of the well-known Rand Index (RI) for measuring the similarity between two partitions, called Probabilistic Rand Index (PRI), in which agreements and disagreements at the object-pair level are weighted according to the probability of their occurring by chance. We then cast consensus clustering as an optimization problem of the PRI value between a target partition and a set of given partitions, experimenting with a simple and very efficient stochastic optimization algorithm. Remarkable performance gains over input partitions as well as over existing related methods are demonstrated through a range of applications, including a new use of consensus clustering to improve subtopic retrieval.
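As a point of reference for the abstract above, the following is a minimal sketch of the classical, unweighted Rand Index: the fraction of object pairs on which two partitions agree (both place the pair in the same cluster, or both separate it). It does not implement the PRI itself, because the abstract does not state how the chance-based weights are computed. The function name `rand_index` and the label-list representation of a partition are assumptions made for illustration.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Classical (unweighted) Rand Index between two partitions.

    Each partition is given as a list of cluster labels, one per object,
    with both lists in the same object order. This is the standard RI,
    not the probabilistic variant (PRI) described in the abstract.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two objects to form a pair")

    agreements = 0
    for i, j in combinations(range(n), 2):
        same_in_a = labels_a[i] == labels_a[j]
        same_in_b = labels_b[i] == labels_b[j]
        # A pair counts as an agreement when both partitions group it
        # together, or both partitions keep it apart.
        if same_in_a == same_in_b:
            agreements += 1

    total_pairs = n * (n - 1) // 2
    return agreements / total_pairs
```

The index depends only on which objects are grouped together, so relabeling the clusters of either partition leaves the value unchanged. A weighted variant in the spirit of the PRI would replace the unit increment with a per-pair weight, but that weighting is not specified here.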
Keywords
information retrieval; optimisation; pattern clustering; probability; stochastic processes; PRI value; consensus clustering; object-pair level agreement; object-pair level disagreement; occurrence probability; optimization problem; performance gain; probabilistic Rand index; similarity measurement; stochastic optimization algorithm; subtopic retrieval; Clustering algorithms; Indexes; Information retrieval; Optimized production technology; Partitioning algorithms; Probabilistic logic; Search problems; Consensus clustering; Rand index; probabilistic Rand index; search results clustering; subtopic retrieval;
fLanguage
English
Journal_Title
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publisher
IEEE
ISSN
0162-8828
Type
jour
DOI
10.1109/TPAMI.2012.80
Filename
6175906