DocumentCode :
3064166
Title :
Wikipedia-Graph Based Key Concept Extraction towards News Analysis
Author :
Zhou, Baoyao ; Luo, Ping ; Xiong, Yuhong ; Liu, Wei
Author_Institution :
HP Labs China, Hewlett-Packard Co., Beijing, China
fYear :
2009
fDate :
20-23 July 2009
Firstpage :
121
Lastpage :
128
Abstract :
Wikipedia can serve as a comprehensive knowledge repository for textual content analysis because of its abundance, high quality, and good structure. In this paper, we propose WikiRank, a Wikipedia-graph based ranking model that can be used to extract key Wikipedia concepts from a document. These key concepts can be regarded as the most salient terms representing the theme of the document. Unlike existing graph-based ranking algorithms, the model builds its concept graph from both the co-occurrence relations within the local context of a document and the preprocessed hyperlink structure of Wikipedia. We have applied the proposed WikiRank model, together with the Support Propagation ranking algorithm, to analyze news articles, especially enterprise news. Promising applications include Wikipedia Concept Linking and Enterprise Concept Cloud Generation.
Keywords :
Internet; graph theory; information resources; Wikipedia hyperlink-structure; Wikipedia-graph based ranking model; enterprise concept cloud generation; graph based key concept extraction; knowledge repository; news analysis; support propagation ranking algorithm; textual content analysis; Algorithm design and analysis; Business; Clouds; Companies; Context modeling; Graph theory; Iterative algorithms; Joining processes; Navigation; Wikipedia; Key concept extraction; Wikipedia Concept Graph
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Commerce and Enterprise Computing, 2009. CEC '09. IEEE Conference on
Conference_Location :
Vienna
Print_ISBN :
978-0-7695-3755-9
Type :
conf
DOI :
10.1109/CEC.2009.54
Filename :
5210808