DocumentCode :
2727431
Title :
Fact Discovery in Wikipedia
Author :
Adafre, Sisay Fissaha ; Jijkoun, Valentin ; De Rijke, Maarten
Author_Institution :
Dublin City Univ., Dublin
fYear :
2007
fDate :
2-5 Nov. 2007
Firstpage :
177
Lastpage :
183
Abstract :
We address the task of extracting focused salient information items, relevant and important for a given topic, from a large encyclopedic resource. Specifically, for a given topic (a Wikipedia article) we identify snippets from other articles in Wikipedia that contain important information for the topic of the original article, without duplicates. We compare several methods for addressing the task, and find that a mixture of content-based, link-based, and layout-based features outperforms other methods, especially in combination with the use of so-called reference corpora that capture the key properties of entities of a common type.
Keywords :
Web sites; information retrieval; Wikipedia; content-based features; encyclopedic resource; fact discovery; layout-based features; link-based features; salient information item extraction; Data mining; Encyclopedias; Natural languages; Radio access networks; Search engines; Uniform resource locators; Web search
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Intelligence, IEEE/WIC/ACM International Conference on
Conference_Location :
Fremont, CA
Print_ISBN :
978-0-7695-3026-0
Type :
conf
DOI :
10.1109/WI.2007.133
Filename :
4427085