Regional Information Center for Science and Technology (RICeST) - Using Non-textual Terms for Boosting Document Keyphrase Assignment

DocumentCode :

3740117

Title :

Using Non-textual Terms for Boosting Document Keyphrase Assignment

Author :

Raquel Silveira; Vasco Furtado; Vládia

Author_Institution :

Programa de Pos-Grad. em Inf. Aplic., Univ. de Fortaleza (UNIFOR), Fortaleza, Brazil

Volume :

Year :

2015

Firstpage :

260

Lastpage :

267

Abstract :

State-of-the-art machine-learning keyphrase extraction systems do not take into consideration the fact that some keyphrases may not be found in the text. Therefore, these systems typically use a training set restricted to textual terms, reducing the learning capabilities of any inductive algorithm. Our research investigates ways to improve the accuracy of these systems by allowing classification algorithms to learn from non-textual terms as well. The basic assumption we have followed is that non-textual terms can be included in the training set by inference from a possible semantic relationship with textual terms. In order to discover the latent relationship between non-textual and textual terms, we propose deductive strategies to be applied to common-sense bases such as Wikipedia. We show that algorithms that follow our approach outperform others that do not use the methods introduced here.

Keywords :

"Encyclopedias","Electronic publishing","Internet","Semantics","Feature extraction","Training"

Publisher :

IEEE

Conference_Title :

Web Intelligence and Intelligent Agent Technology (WI-IAT), 2015 IEEE / WIC / ACM International Conference on

Type :

conf

DOI :

10.1109/WI-IAT.2015.216

Filename :

7396813

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3740117