Title :
An Approach of Vector Space Model to Link Concrete Concepts with Wiki Entities
Author :
Lucas Borges Monteiro;Li Weigang;Ahmed Abdelfattah Saleh
Author_Institution :
Dept. of Comput. Sci., Univ. of Brasilia, Brasilia, Brazil
Abstract :
Entity Linking (EL), search, and labeling are important research topics with various web applications. The challenge is to find the important concepts in web text, rather than only personal and place names, and to link them to online encyclopedia databases. This paper presents a new approach to linking concrete concepts in English texts with Wiki entities. Part-of-speech tagging is used to detect concrete concepts, and a Vector Space Model is applied to disambiguate and select the Wiki entities. Compared with existing methods, the proposed framework, named UnBWiki VSM, achieved satisfactory results and was adjusted using the Wikilinks database of 2.8 million entities and 18 million words. As a case study, five web texts from the British Royal Family History were analyzed manually. The results obtained using UnBWiki VSM were satisfactory, with a recall of 73.5% for the five selected texts. These results indicate the potential of automatic Entity Linking (EL) for web pages.
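The disambiguation step named in the abstract can be illustrated with a minimal sketch. It assumes a plain term-frequency vector space and cosine similarity, which is one common form of a Vector Space Model; the abstract does not specify the weighting or similarity measure used by UnBWiki VSM, and the function names, the context sentence, and the two candidate descriptions below are hypothetical toy data, not taken from the paper or from Wikilinks.

```python
import math
from collections import Counter


def tf_vector(text):
    """Build a bag-of-words term-frequency vector (lowercased, whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def link_concept(context, candidates):
    """Score each candidate entity description against the context.

    Returns the best-scoring entity name and the full score table.
    """
    ctx = tf_vector(context)
    scores = {name: cosine(ctx, tf_vector(desc)) for name, desc in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy data: the text around the concept "crown" and two
# made-up candidate entity descriptions.
context = "the king wore the crown at the coronation ceremony"
candidates = {
    "Crown (headgear)": "a crown is the headgear worn by a king or queen at a coronation",
    "Crown (dentistry)": "a crown is a dental cap placed over a tooth",
}

best, scores = link_concept(context, candidates)
print(best)
```

In this toy case the headgear candidate shares more terms with the context than the dentistry candidate, so it receives the higher cosine score. A realistic system would likely add stop-word removal and a weighting such as TF-IDF; those choices are assumptions here as well.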
Keywords :
"Concrete","Encyclopedias","Internet","Electronic publishing","Databases","Context"
Conference_Title :
Computer and Information Technology; Ubiquitous Computing and Communications; Dependable, Autonomic and Secure Computing; Pervasive Intelligence and Computing (CIT/IUCC/DASC/PICOM), 2015 IEEE International Conference on
DOI :
10.1109/CIT/IUCC/DASC/PICOM.2015.45