DocumentCode
3585510
Title
Keywords Extraction from Chinese Document Based on Complex Network Theory
Author
Jiangxia Nan ; Bo Xiao ; Zhiqing Lin ; Qianfang Xu
Author_Institution
Inst. of Sensing Technol. & Bus., Beijing Univ. of Posts & Telecommun., Beijing, China
Volume
2
fYear
2014
Firstpage
383
Lastpage
386
Abstract
Keyword extraction is the process of choosing several words from a document to express its main idea. Keywords help people understand an article quickly and clearly. In recent years, keyword extraction has attracted growing research attention because of its important role in text clustering, text classification, automatic abstracting, and text retrieval. This paper proposes an algorithm called EC-DC that extracts keywords based on centrality measures of complex networks. A document is mapped to a network, with its words mapped to vertices and the relations between words mapped to edges. The importance of each word is then evaluated using eccentricity centrality and degree centrality, and the K most important words are extracted as keywords. Experimental results show that the EC-DC algorithm improves precision, recall, and F-score by about 9% compared with the classical TF-IDF algorithm.
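The pipeline described in the abstract (words as vertices, word relations as edges, ranking by eccentricity and degree centrality, top K words) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how edges are defined or how the two centralities are combined, so the sliding co-occurrence window (`window`), the reciprocal form of eccentricity centrality, and the weighted sum controlled by `alpha` are all assumptions made here for illustration. The input is assumed to be a list of tokens that has already been word-segmented and stop-word filtered.

```python
from collections import deque


def build_cooccurrence_graph(tokens, window=2):
    """Map a token list to an undirected graph (dict of neighbor sets).

    Assumption: two words are related if they co-occur within `window` tokens.
    """
    graph = {}
    for i, word in enumerate(tokens):
        graph.setdefault(word, set())
        for other in tokens[i + 1 : i + window]:
            if other != word:
                graph[word].add(other)
                graph.setdefault(other, set()).add(word)
    return graph


def eccentricity(graph, source):
    """Largest shortest-path distance from `source` within its component (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return max(dist.values())


def extract_keywords(tokens, k=3, window=2, alpha=0.5):
    """Rank words by a combined score and return the top k.

    Assumption: score = alpha * (1 / eccentricity) + (1 - alpha) * degree centrality.
    """
    graph = build_cooccurrence_graph(tokens, window)
    n = len(graph)
    scores = {}
    for word, neighbors in graph.items():
        degree_c = len(neighbors) / (n - 1) if n > 1 else 0.0
        ecc = eccentricity(graph, word)
        ecc_c = 1.0 / ecc if ecc > 0 else 0.0
        scores[word] = alpha * ecc_c + (1 - alpha) * degree_c
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]
```

Under these assumptions, a word that sits at the center of the network (small eccentricity) and has many distinct neighbors (high degree) ranks highest; for a Chinese document the `tokens` list would come from a separate word-segmentation step, which is outside this sketch.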
Keywords
complex networks; feature extraction; text analysis; Chinese document; EC-DC algorithm; automatic abstracting; complex network centrality measures; complex network theory; degree centrality; eccentricity centrality; keywords extraction; text classification; text clustering; text retrieval; Approximation algorithms; Business; Data mining; Internet; Semantics; complex network; document network
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence and Design (ISCID), 2014 Seventh International Symposium on
Print_ISBN
978-1-4799-7004-9
Type
conf
DOI
10.1109/ISCID.2014.183
Filename
7082012