• DocumentCode
    1994948
  • Title
    Chinese NP Chunking: A Semi-Supervised Approach
  • Author
    Lin, Yen-Hsi ; Gao, Zhao-Ming
  • Author_Institution
    Nat. Taiwan Univ., Taiwan
  • fYear
    2008
  • fDate
    15-16 Dec. 2008
  • Firstpage
    344
  • Lastpage
    346
  • Abstract
    In Chinese, a V N or N V sequence may form a noun phrase, which makes NP chunking particularly difficult. We present a method that tackles this problem by combining Chinese Sinica Treebank data with unlabelled data to train a better SVM-based model. Experiments on open test data show that the proposed semi-supervised approach achieves an F-measure of 78.79%, an improvement of 8.79% over the supervised approach.
  • Keywords
    learning (artificial intelligence); natural language processing; support vector machines; Chinese Sinica Treebank data; Chinese noun phrase chunking; SVM; semisupervised learning approach; statistical f-measure; Air pollution; Hidden Markov models; Natural language processing; Natural languages; Robustness; Search engines; Statistical analysis; Support vector machines; Testing; Training data; Chinese NP chunking; SVM; semi-supervised approach;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Universal Communication, 2008. ISUC '08. Second International Symposium on
  • Conference_Location
    Osaka
  • Print_ISBN
    978-0-7695-3433-6
  • Type
    conf
  • DOI
    10.1109/ISUC.2008.62
  • Filename
    4724483