DocumentCode :
2983317
Title :
Labels vs. Pairwise Constraints: A Unified View of Label Propagation and Constrained Spectral Clustering
Author :
Xiang Wang ; Buyue Qian ; Davidson, Ian
Author_Institution :
Dept. of Comput. Sci., Univ. of California, Davis, Davis, CA, USA
fYear :
2012
fDate :
10-13 Dec. 2012
Firstpage :
1146
Lastpage :
1151
Abstract :
In many real-world applications we can model the data as a graph with each node being an instance and the edges indicating a degree of similarity. Side information is often available in the form of labels for a small subset of instances, which gives rise to two problem settings and two types of algorithms. In the label propagation style algorithms, the known labels are propagated to the unlabeled nodes. In the constrained clustering style algorithms, known labels are first converted to pair wise constraints (Must-Link and Cannot-Link), then a constrained cut is computed as a tradeoff between minimizing the cut cost and maximizing the constraint satisfaction. Both techniques are evaluated by their ability to recover the ground truth labeling, i.e. by 0/1 loss function either directly on the labels or on the pair wise relations derived from the labels. These two fields have developed separately, but in this paper, we show that they are indeed related. This insight allows us to propose a novel way to generate constraints from the propagated labels, which our empirical study shows outperforms and is more stable than the state-of-the-art label propagation and constrained spectral clustering algorithms.
Keywords :
data mining; graph theory; learning (artificial intelligence); pattern clustering; cannot-link constraint; constrained clustering style algorithm; constrained spectral clustering; constraint satisfaction; cut cost; data mining; graph edge; graph node; ground truth labeling; label propagation style algorithm; loss function; machine learning; must-link constraint; pairwise constraint; side information; similarity degree; Clustering algorithms; Computer science; Educational institutions; Eigenvalues and eigenfunctions; Indexes; Labeling; Partitioning algorithms; constrained spectral clustering; label propagation; semi-supervised learning;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Data Mining (ICDM), 2012 IEEE 12th International Conference on
Conference_Location :
Brussels
ISSN :
1550-4786
Print_ISBN :
978-1-4673-4649-8
Type :
conf
DOI :
10.1109/ICDM.2012.103
Filename :
6413794
Link To Document :
بازگشت