• DocumentCode
    2260175
  • Title

    A method of building Chinese field association knowledge from Wikipedia

  • Author

    Wang, Li ; Yata, Susumu ; Atlam, El-Sayed ; Fuketa, Masao ; Morita, Kazuhiro ; Bando, Hiroaki ; Aoe, Jun-Ichi

  • Author_Institution
    Dept. of Inf. Sci. & Intell. Syst., Univ. of Tokushima, Tokushima, Japan
  • fYear
    2009
  • fDate
    24-27 Sept. 2009
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Field association (FA) terms form a limited set of discriminating terms that provide the knowledge needed to identify document fields. The primary goal of this research is to build a system that imitates the process by which humans recognize a document's field from a few Chinese FA terms. This paper proposes a new approach for automatically building a Chinese FA term dictionary from Wikipedia. In total, 104,532 FA terms are added to the dictionary. The FA terms in this dictionary are then used to recognize the fields of 5,841 documents, with an average accuracy of 92.04%. The results show that the presented method is effective for automatically building FA terms from Wikipedia.
  • Keywords
    Web sites; dictionaries; document handling; feature extraction; natural language processing; Chinese FA term dictionary; Wikipedia; automatic Chinese field association term knowledge building; document field term identification; Humans; Information science; Intelligent structures; Intelligent systems; Internet; Knowledge engineering; Rockets; Systems engineering and theory; Chinese documents; Feature fields; Field association terms; Field recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering, 2009. NLP-KE 2009. International Conference on
  • Conference_Location
    Dalian
  • Print_ISBN
    978-1-4244-4538-7
  • Electronic_ISBN
    978-1-4244-4540-0
  • Type

    conf

  • DOI
    10.1109/NLPKE.2009.5313781
  • Filename
    5313781