• DocumentCode
    170369
  • Title
    Conceptual graph based text classification
  • Author
    Yi Wan ; Tingting He ; Xinhui Tu
  • Author_Institution
    Sch. of Comput. Sci., Central China Normal Univ., Wuhan, China
  • fYear
    2014
  • fDate
    16-18 May 2014
  • Firstpage
    104
  • Lastpage
    108
  • Abstract
    Most traditional Wikipedia-based methods use only article content. Organizing Wikipedia articles as a graph lets our method also exploit other kinds of information, such as category and structure information. In this paper, we propose a novel text classification method that uses knowledge from a conceptual graph built from Wikipedia. First, we build the conceptual graph: each article is treated as a concept node, and titles, hyperlinks, texts, and category information are used to define edges that measure the relationships between concepts. Each text is then mapped to its respective set of nodes, and Personalized PageRank (a random walk) is used to generate the set of most important nodes that best represents the text. Finally, the two sets are scored with a vector similarity measure. We evaluate our techniques on a standard text classification dataset (20newsgroup); the results show the effectiveness of the proposed approach.
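The pipeline described in the abstract (seed nodes, Personalized PageRank, vector similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy graph, its edge weights, the damping factor, the function names, and the idea of comparing a document's rank vector against a category's rank vector are all assumptions made for illustration, since the abstract does not give these details.

```python
from math import sqrt

# Hypothetical toy concept graph: node -> {neighbor: edge weight}.
# In the paper, nodes are Wikipedia articles; these weights are invented.
toy_graph = {
    "Baseball": {"Pitcher": 1.0, "Sport": 0.5},
    "Pitcher": {"Baseball": 1.0},
    "Sport": {"Baseball": 0.5},
    "CPU": {"Computer hardware": 1.0},
    "Graphics card": {"Computer hardware": 1.0},
    "Computer hardware": {"CPU": 1.0, "Graphics card": 1.0},
}


def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power iteration for a random walk that restarts at the seed nodes.

    seeds maps node -> non-negative weight (for example, how strongly a
    text mentions that concept); it is normalized to a distribution.
    """
    nodes = set(graph) | set(seeds)
    for neighbors in graph.values():
        nodes |= set(neighbors)
    total = sum(seeds.values())
    restart = {n: seeds.get(n, 0.0) / total for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            out = graph.get(n, {})
            out_weight = sum(out.values())
            if out_weight == 0:
                # Dangling node: return its mass to the restart distribution.
                for m in nodes:
                    nxt[m] += damping * rank[n] * restart[m]
            else:
                for m, w in out.items():
                    nxt[m] += damping * rank[n] * w / out_weight
        rank = nxt
    return rank


def top_k(rank, k):
    """Keep only the k highest-ranked nodes as the representative set."""
    best = sorted(rank.items(), key=lambda kv: kv[1], reverse=True)[:k]
    return dict(best)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Assumed usage: a text that mentions "Pitcher" versus two category profiles.
doc_vec = top_k(personalized_pagerank(toy_graph, {"Pitcher": 1.0}), 3)
sport_vec = top_k(personalized_pagerank(toy_graph, {"Baseball": 1.0}), 3)
hardware_vec = top_k(personalized_pagerank(toy_graph, {"CPU": 1.0}), 3)
predicted = max(
    [("sport", sport_vec), ("hardware", hardware_vec)],
    key=lambda item: cosine(doc_vec, item[1]),
)[0]
```

In this sketch the restart distribution concentrates the walk around the concepts a text mentions, so related concepts that the text never names still receive rank; how the paper actually weights edges and forms the two compared sets is not stated in the abstract.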
  • Keywords
    Web sites; graph theory; knowledge representation; pattern classification; text analysis; vectors; Wikipedia; conceptual graph; personalized PageRank; random walk; text classification; vector similarity measure; Electronic publishing; Encyclopedias; Feature extraction; Internet; Knowledge based systems; Semantics; semantic similarity
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Progress in Informatics and Computing (PIC), 2014 International Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4799-2033-4
  • Type
    conf
  • DOI
    10.1109/PIC.2014.6972305
  • Filename
    6972305