DocumentCode
174899
Title
Keyphrase Extraction: Abstracts Instead of Full Papers
Author
Popova, S. ; Danilova, V.
Author_Institution
St.-Petersburg State Univ., St. Petersburg, Russia
fYear
2014
fDate
1-5 Sept. 2014
Firstpage
241
Lastpage
245
Abstract
In this paper we consider the problem of keyphrase extraction from scientific articles. Finding an appropriate solution is important for organizing fast navigation in databases and for indexing, clustering, and classification of academic papers. The base collection (SemEval2010) includes keyphrases selected by experts for each text. We show that using abstracts alone improves on the results obtained by processing full texts or abstracts combined with the introduction and conclusion sections. Our approach extracts keyphrases with part-of-speech-based linguistic patterns, which are built from an auxiliary dataset. Compared with full texts, using abstracts reduces the number of word sequences extracted by the patterns, which makes it possible to simplify or entirely omit the ranking stage. Ranking is usually needed because only 10-15 keyphrases have to be chosen out of many candidates. This stage is the most difficult, and its effectiveness depends on the number of candidate keyphrases selected. Using abstracts considerably reduces the number of candidate phrases while still yielding high recall.
Keywords
data mining; information retrieval; natural language processing; text analysis; academic paper classification; academic paper clustering; academic paper indexing; keyphrase extraction; linguistic patterns; scientific articles; word sequence extraction; Abstracts; Artificial neural networks; Data mining; Feature extraction; Gold; Pragmatics; Standards; abstract processing; indexing; informational retrieval; keyphrase extraction; keyphrase identification;
fLanguage
English
Publisher
ieee
Conference_Titel
Database and Expert Systems Applications (DEXA), 2014 25th International Workshop on
Conference_Location
Munich
ISSN
1529-4188
Print_ISBN
978-1-4799-5721-7
Type
conf
DOI
10.1109/DEXA.2014.57
Filename
6974856
Link To Document