DocumentCode
3316892
Title
Automatically acquiring part of speech correcting rules of multi-category words based on incomplete decision tables
Author
Wang, Suge ; Yang, Junling ; Li, Deyu ; Zhang, Wu
Author_Institution
Sch. of Comput. Eng. & Sci., Shanghai Univ., China
fYear
2005
fDate
30 Oct.-1 Nov. 2005
Firstpage
68
Lastpage
72
Abstract
Part-of-speech (POS) tagging is a basic task in Chinese information processing. In general, the existence of multi-category words greatly affects the processing quality of corpora. Highly efficient methods and automatic correction techniques for multi-category word tagging are key to improving tagging precision. In this paper, for part-of-speech correction of multi-category words, a modeling method based on an incomplete decision table is introduced, and two algorithms, one for attribute reduction and one for object reduction, both based on attribute significance, are presented for automatically acquiring correction rules. Test results show the validity of the method for improving part-of-speech tagging precision in large-scale corpus engineering.
Keywords
computational linguistics; decision tables; natural languages; speech processing; Chinese information processing; POS; multicategory word tagging; part of speech; Context modeling; Information processing; Information technology; Large-scale systems; Mathematics; Set theory; Statistics; Tagging; Testing
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Language Processing and Knowledge Engineering, 2005. IEEE NLP-KE '05. Proceedings of 2005 IEEE International Conference on
Print_ISBN
0-7803-9361-9
Type
conf
DOI
10.1109/NLPKE.2005.1598709
Filename
1598709