DocumentCode :
168283
Title :
Crowd-sourcing Web knowledge for metadata extraction
Author :
Zhaohui Wu ; Wenyi Huang ; Chen Liang ; C. Lee Giles
Author_Institution :
Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA
fYear :
2014
fDate :
8-12 Sept. 2014
Firstpage :
141
Lastpage :
144
Abstract :
We explore a new metadata extraction framework that requires no human annotators; the ground truth is instead harvested from the Web. A new training sample is selected based not only on its uncertainty and representativeness in the unlabeled pool, but also on its availability and credibility in Web knowledge bases. We construct a dataset of 4329 books with valid metadata and evaluate our approach using 5 Web book databases as oracles. Empirical results demonstrate its effectiveness and efficiency.
Keywords :
Internet; knowledge based systems; learning (artificial intelligence); meta data; query processing; Web Knowledge crowdsourcing; Web book databases; Web knowledge bases; metadata extraction framework; oracles; training sample; Abstracts; IP networks; Welding;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Digital Libraries (JCDL), 2014 IEEE/ACM Joint Conference on
Conference_Location :
London
Type :
conf
DOI :
10.1109/JCDL.2014.6970160
Filename :
6970160