DocumentCode :
1910752
Title :
Semi-Supervised Classification of Network Data Using Very Few Labels
Author :
Lin, Frank ; Cohen, William W.
Author_Institution :
Carnegie Mellon Univ., Pittsburgh, PA, USA
fYear :
2010
fDate :
9-11 Aug. 2010
Firstpage :
192
Lastpage :
199
Abstract :
The goal of semi-supervised learning (SSL) methods is to reduce the amount of labeled training data required by learning from both labeled and unlabeled instances. Macskassy and Provost (2007) proposed the weighted-vote relational neighbor classifier (wvRN) as a simple yet effective baseline for semi-supervised learning on network data. wvRN is similar to many recent graph-based SSL methods, has been shown to be essentially the same as the Gaussian-field harmonic functions classifier proposed by Zhu et al. (2003), and is very effective on some benchmark network datasets. We describe another simple and intuitive semi-supervised learning method, based on random graph walks, that outperforms wvRN by a large margin on several benchmark datasets when very few labels are available. Additionally, we show that using authoritative instances as training seeds (instances that arguably cost much less to label) dramatically reduces the amount of labeled data required to achieve the same classification accuracy. For some existing state-of-the-art semi-supervised learning methods, the labeled data needed is reduced by a factor of 50.
Keywords :
Gaussian processes; graph theory; learning (artificial intelligence); pattern classification; Gaussian field classifier; network data; random graph walk; semisupervised classification; semisupervised learning methods; weighted vote relational neighbor classifier; Accuracy; Blogs; Equations; Labeling; Learning systems; Training; Training data; label propagation; learning on graph data; learning on network data; semi-supervised learning;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advances in Social Networks Analysis and Mining (ASONAM), 2010 International Conference on
Conference_Location :
Odense
Print_ISBN :
978-1-4244-7787-6
Electronic_ISBN :
978-0-7695-4138-9
Type :
conf
DOI :
10.1109/ASONAM.2010.19
Filename :
5562771