DocumentCode :
495531
Title :
Chinese Verb Subcategorization Acquisition from Noisy Data on Sentence Level
Author :
Zhu, CongHui ; Zhao, Tiejun ; Han, Xiwu
Author_Institution :
Key Lab. of NLP & Speech, Harbin Inst. of Technol., Harbin, China
Volume :
4
fYear :
2009
fDate :
March 31 2009-April 2 2009
Firstpage :
448
Lastpage :
452
Abstract :
Subcategorization is the process that further classifies a syntactic category into its subsets. Aiming to improve the recall of acquisition, we design an automatic approach that enriches the argument knowledge of subcategorization frames (SCFs) by means of active learning and employs a multi-class SVM model to classify argument types. The approach can thus output an accurate SCF for each input sentence, even on noisy data, while avoiding hand-written rules. It generates hypotheses directly, without a statistical filtering step after generation. Experimental results indicate that acquisition performance improves significantly, especially in recall, which increased from 88.83 to 99.75 in the open test.
Keywords :
information filtering; knowledge acquisition; natural language processing; statistical analysis; support vector machines; Chinese verb subcategorization acquisition; multiclass SVM model; noisy data; sentence level; statistical filtering; support vector machine; Computer science; Data engineering; Educational technology; Information filtering; Information filters; Natural languages; Noise level; Support vector machine classification; Support vector machines; Testing; Chinese verb subcategorization; active learning; noisy data
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Science and Information Engineering, 2009 WRI World Congress on
Conference_Location :
Los Angeles, CA
Print_ISBN :
978-0-7695-3507-4
Type :
conf
DOI :
10.1109/CSIE.2009.361
Filename :
5171036