Title :
Chinese NP Chunking: A Semi-Supervised Approach
Author :
Lin, Yen-Hsi ; Gao, Zhao-Ming
Author_Institution :
Nat. Taiwan Univ., Taiwan
Abstract :
In Chinese, both verb-noun (V N) and noun-verb (N V) sequences may form a noun phrase, which makes NP chunking in Chinese particularly difficult. We present a method that tackles this problem by combining Chinese Sinica Treebank data with unlabelled data to train a better SVM-based model. Experiments on open test data show that the proposed semi-supervised approach achieves an F-measure of 78.79%, an improvement of 8.79% over the supervised approach.
Keywords :
learning (artificial intelligence); natural language processing; support vector machines; Chinese Sinica Treebank data; Chinese noun phrase chunking; SVM; semisupervised learning approach; statistical f-measure; Air pollution; Hidden Markov models; Natural languages; Robustness; Search engines; Statistical analysis; Testing; Training data; Chinese NP chunking; semi-supervised approach
Conference_Title :
Universal Communication, 2008. ISUC '08. Second International Symposium on
Conference_Location :
Osaka
Print_ISBN :
978-0-7695-3433-6
DOI :
10.1109/ISUC.2008.62