Title :
Mining Linked Open Data through Semi-supervised Learning Methods Based on Self-Training
Author :
Fanizzi, Nicola ; d'Amato, C. ; Esposito, Floriana
Author_Institution :
Dipt. di Inf., Univ. degli studi di Bari, Bari, Italy
Abstract :
The paper tackles the problem of mining linked open data. The open-world assumption underlying the semantics of the data model entails an inherent lack of knowledge, which leaves an abundance of data of uncertain classification. We present a semi-supervised machine learning approach. Specifically, a self-training strategy is adopted, which iteratively uses labeled instances to predict labels for unlabeled instances as well. The approach is evaluated empirically through extensive experiments involving several different algorithms, demonstrating the added value of a semi-supervised approach over standard supervised methods.
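The self-training loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid base classifier, the distance-based confidence score, the 0.8 threshold, and all function names (centroids, predict_with_confidence, self_train) are assumptions chosen only to keep the example self-contained.

```python
import math


def centroids(X, y):
    """Mean feature vector per class label (hypothetical base learner)."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_with_confidence(cents, x):
    """Return (label, confidence); confidence is a softmax over negative distances."""
    weights = {c: math.exp(-math.dist(x, m)) for c, m in cents.items()}
    label = max(weights, key=weights.get)
    return label, weights[label] / sum(weights.values())


def self_train(X_lab, y_lab, X_unlab, threshold=0.8, max_rounds=10):
    """Iteratively move confidently predicted unlabeled instances into the labeled set.

    Returns the final class centroids and the instances left unlabeled.
    The threshold and round limit are illustrative assumptions.
    """
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        cents = centroids(X_lab, y_lab)  # retrain on the current labeled set
        remaining, added = [], 0
        for x in pool:
            label, conf = predict_with_confidence(cents, x)
            if conf >= threshold:
                # Accept the predicted label as if it were a true label.
                X_lab.append(x)
                y_lab.append(label)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0 or not pool:
            break  # nothing new was labeled, or nothing is left to label
    return centroids(X_lab, y_lab), pool
```

Under these assumptions, instances whose prediction never reaches the threshold stay unlabeled, which mirrors the uncertain classification that the open-world assumption leaves in the data.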
Keywords :
data mining; data models; learning (artificial intelligence); open systems; pattern classification; semantic Web; data interoperability; data model semantic; labeled instances; linked open data mining; self-training strategy; semi-supervised machine learning methods; uncertain classification; unlabeled instances; Data mining; Knowledge based systems; Prediction algorithms; Predictive models; Semantic Web; Semisupervised learning; Training
Conference_Title :
Semantic Computing (ICSC), 2012 IEEE Sixth International Conference on
Conference_Location :
Palermo
Print_ISBN :
978-1-4673-4433-3
DOI :
10.1109/ICSC.2012.54