DocumentCode :
2509489
Title :
An Architecture for Finding Entities on the Web
Author :
Demartini, Gianluca ; Firan, Claudiu S. ; Georgescu, Mihai ; Iofciu, Tereza ; Krestel, Ralf ; Nejdl, Wolfgang
Author_Institution :
L3S Res. Center, Univ. of Hanover, Hanover, Germany
fYear :
2009
fDate :
9-11 Nov. 2009
Firstpage :
230
Lastpage :
237
Abstract :
Recent progress in research fields such as information extraction and information retrieval enables the creation of systems that provide better search experiences to Web users. For example, systems that retrieve entities instead of just documents have been built. In this paper we present an approach for large-scale entity retrieval using Web collections as the underlying corpus. We propose an architecture for entity extraction and entity ranking starting from Web documents. This is achieved by (1) using an existing Web document index and (2) creating an entity-centric index. We describe the advantages and feasibility of our approach using state-of-the-art tools.
Keywords :
Internet; document handling; information retrieval; Web collections; Web document index; Web documents; World Wide Web; entity centric index; entity extraction; entity ranking; information extraction; information retrieval; large-scale entity retrieval; Data mining; Erbium; Image retrieval; Information retrieval; Natural language processing; Search engines; Service oriented architecture; Web pages; Web search; Wikipedia; entity retrieval; natural language processing; web search;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Web Congress, 2009. LA-WEB '09. Latin American
Conference_Location :
Merida, Yucatan
Print_ISBN :
978-0-7695-3856-3
Type :
conf
DOI :
10.1109/LA-WEB.2009.14
Filename :
5341521