DocumentCode
170369
Title
Conceptual graph based text classification
Author
Yi Wan ; Tingting He ; Xinhui Tu
Author_Institution
Sch. of Comput. Sci., Central China Normal Univ., Wuhan, China
fYear
2014
fDate
16-18 May 2014
Firstpage
104
Lastpage
108
Abstract
Most traditional Wikipedia-based methods use only article content. By organizing Wikipedia articles as a graph, our method can also exploit additional information such as category and structure information. In this paper, we propose a novel text classification method that uses knowledge from a conceptual graph built from Wikipedia. First, we build the conceptual graph: each article is treated as a concept node, and titles, hyperlinks, texts and category information are used as edges to measure the relationships between concepts. Each text is then mapped to its respective set of nodes, and Personalized PageRank (a random walk) is used to generate the set of most important nodes that best represent the text. Finally, the two sets are scored with a vector similarity measure. We evaluate our techniques on the standard text classification dataset (20newsgroup); the results show the effectiveness of the proposed approach.
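The pipeline described in the abstract (seed concepts, Personalized PageRank, vector similarity) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy graph, its edge weights, the damping factor, the top-k cutoff, the use of cosine similarity, and the idea of representing each category by a few seed concepts. The names `personalized_pagerank`, `top_k`, `cosine`, `classify`, `GRAPH` and `CATEGORIES` are hypothetical.

```python
import math

# Hypothetical toy concept graph: node -> {neighbor: edge weight}.
# In the paper, nodes are Wikipedia articles; these weights are made up.
GRAPH = {
    "Graphics card": {"Computer hardware": 1.0, "Video game": 0.5},
    "Computer hardware": {"Graphics card": 1.0, "Operating system": 1.0},
    "Operating system": {"Computer hardware": 1.0},
    "Video game": {"Graphics card": 0.5},
    "Ice hockey": {"Sports league": 1.0},
    "Sports league": {"Ice hockey": 1.0, "Baseball": 1.0},
    "Baseball": {"Sports league": 1.0},
}

# Assumption: each category is represented by a few seed concepts.
CATEGORIES = {
    "hardware": {"Computer hardware": 1.0, "Operating system": 1.0},
    "sports": {"Ice hockey": 1.0, "Baseball": 1.0},
}


def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power iteration for a random walk that restarts at the seed nodes."""
    nodes = list(graph)
    total = sum(seeds.get(n, 0.0) for n in nodes)
    restart = {n: seeds.get(n, 0.0) / total for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            weight_sum = sum(graph[n].values())
            if weight_sum == 0:
                # Dangling node: send its mass back to the restart distribution.
                for m in nodes:
                    nxt[m] += damping * rank[n] * restart[m]
                continue
            for m, w in graph[n].items():
                nxt[m] += damping * rank[n] * w / weight_sum
        rank = nxt
    return rank


def top_k(rank, k):
    """Keep only the k highest-ranked nodes as a sparse vector."""
    best = sorted(rank, key=rank.get, reverse=True)[:k]
    return {n: rank[n] for n in best}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(n, 0.0) for n, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify(doc_seeds, k=3):
    """Pick the category whose top-k concept vector is closest to the text's."""
    doc_vec = top_k(personalized_pagerank(GRAPH, doc_seeds), k)
    scores = {
        label: cosine(doc_vec, top_k(personalized_pagerank(GRAPH, seeds), k))
        for label, seeds in CATEGORIES.items()
    }
    return max(scores, key=scores.get)


# A text whose only mapped concept is "Graphics card".
print(classify({"Graphics card": 1.0}))
```

In this sketch the restart distribution is what makes the walk "personalized": rank mass concentrates around the concepts a text mentions, so two texts on related topics end up with overlapping top-ranked nodes even when they share no seed concept.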
Keywords
Web sites; graph theory; knowledge representation; pattern classification; text analysis; vectors; Wikipedia; conceptual graph; personalized PageRank; random walk; text classification; vector similarity measure; Electronic publishing; Encyclopedias; Feature extraction; Internet; Knowledge based systems; Semantics; semantic similarity
fLanguage
English
Publisher
IEEE
Conference_Titel
Progress in Informatics and Computing (PIC), 2014 International Conference on
Conference_Location
Shanghai
Print_ISBN
978-1-4799-2033-4
Type
conf
DOI
10.1109/PIC.2014.6972305
Filename
6972305