Regional Information Center for Science and Technology - Crowd-sourcing Web knowledge for metadata extraction

DocumentCode :

168283

Title :

Crowd-sourcing Web knowledge for metadata extraction

Author :

Zhaohui Wu ; Wenyi Huang ; Chen Liang ; Giles, C. Lee

Author_Institution :

Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA

fYear :

2014

fDate :

8-12 Sept. 2014

Firstpage :

141

Lastpage :

144

Abstract :

We explore a new metadata extraction framework that requires no human annotators, with the ground truth harvested from the Web. A new training sample is selected based not only on its uncertainty and representativeness in the unlabeled pool, but also on its availability and credibility in Web knowledge bases. We construct a dataset of 4329 books with valid metadata and evaluate our approach using 5 Web book databases as oracles. Empirical results demonstrate its effectiveness and efficiency.
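
The abstract names four selection criteria (uncertainty, representativeness, availability, credibility) but this record does not give the paper's formulas. The following is only a minimal sketch of how such a selection step could look, under assumptions that are not from the paper: the `Candidate` fields, the entropy-based uncertainty, the agreement-based credibility, and the product used to combine the four terms are all hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """One unlabeled sample. Every field is a hypothetical stand-in."""
    name: str
    label_probs: List[float]             # model's predicted label distribution
    representativeness: float            # e.g. mean similarity to the pool, in [0, 1]
    oracle_answers: List[Optional[str]]  # one answer per Web oracle, None if no record


def uncertainty(probs: List[float]) -> float:
    """Shannon entropy of the prediction, normalized to [0, 1] (assumed measure)."""
    if len(probs) < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(len(probs))


def availability(answers: List[Optional[str]]) -> float:
    """Fraction of oracles that returned any record (assumed measure)."""
    if not answers:
        return 0.0
    return sum(a is not None for a in answers) / len(answers)


def credibility(answers: List[Optional[str]]) -> float:
    """Share of returned answers that match the most common answer (assumed measure)."""
    found = [a for a in answers if a is not None]
    if not found:
        return 0.0
    return max(found.count(a) for a in set(found)) / len(found)


def score(c: Candidate) -> float:
    """Combine the four criteria as a product; the combination rule is an assumption."""
    return (
        uncertainty(c.label_probs)
        * c.representativeness
        * availability(c.oracle_answers)
        * credibility(c.oracle_answers)
    )


def select_next(pool: List[Candidate]) -> Candidate:
    """Pick the highest-scoring candidate as the next training sample."""
    return max(pool, key=score)
```

With a product, a sample that no oracle can answer scores zero and is never selected, however uncertain the model is about it. A weighted sum would be an equally plausible alternative.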

Keywords :

Internet; knowledge based systems; learning (artificial intelligence); meta data; query processing; Web Knowledge crowdsourcing; Web book databases; Web knowledge bases; metadata extraction framework; oracles; training sample; Abstracts; IP networks; Welding;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Digital Libraries (JCDL), 2014 IEEE/ACM Joint Conference on

Conference_Location :

London

Type :

conf

DOI :

10.1109/JCDL.2014.6970160

Filename :

6970160

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=168283