DocumentCode :
2912
Title :
Named Entity Disambiguation over Texts Written in the Portuguese or Spanish Languages
Author :
Santos, Joao Tiago Luis ; Anastacio, Ivo Miguel ; Martins, Bruno Emanuel
Author_Institution :
Inst. Super. Tecnico e INESC-ID, Univ. de Lisboa (IST/UL), Lisbon, Portugal
Volume :
13
Issue :
3
fYear :
2015
fDate :
March 2015
Firstpage :
856
Lastpage :
862
Abstract :
This article addresses the problem of disambiguating named entities in text documents by linking them to entries in a knowledge base such as Wikipedia. The proposed approach uses supervised learning to rank candidate knowledge base entries for each entity mentioned in a text, and then to classify the top-ranked entry as either the correct disambiguation or not. We present results on Portuguese and Spanish texts for a wide range of models and configuration options. Our experiments attest to the effectiveness of supervised learning methods in this task, showing that out-of-the-box algorithms and relatively simple features can achieve high accuracy.
Keywords :
Web sites; knowledge based systems; learning (artificial intelligence); natural language processing; text analysis; Portuguese language; Portuguese text; Spanish language; Spanish text; knowledge base entry; knowledge base like Wikipedia; named entity disambiguation; out-of-the-box algorithm; supervised learning method; text document; Abstracts; Electronic publishing; Encyclopedias; Google; Knowledge based systems; Supervised learning; Information Extraction; Named Entity Disambiguation; Supervised Machine Learning;
fLanguage :
English
Journal_Title :
Latin America Transactions, IEEE (Revista IEEE America Latina)
Publisher :
IEEE
ISSN :
1548-0992
Type :
jour
DOI :
10.1109/TLA.2015.7069115
Filename :
7069115