Title :
Rough Set Based Approach to Base Noun Phrase Identification
Author :
Guo, Yonghui ; Ma, Fang ; Wang, Bingxi ; Li, Jian
Author_Institution :
Inst. of Electron. Technol., Inf. Eng. Univ., Zhengzhou
Abstract :
This paper presents a novel rough set-based approach to identifying base noun phrases (BaseNPs). In this approach, we divide the whole task into two sequential subtasks: tagging and identifying. We regard BaseNP tagging as a decision-making process, which can be accomplished through rough set theory. Our tagging procedure is characterized by feature reduction and rule optimization. The paper focuses on three aspects. First, we describe the rough set-based rule learning mechanism and the associated algorithms. Next, we give a detailed account of the finite state transducer (FST) used for BaseNP identification. Finally, we discuss the handling of instance collisions to improve system performance. Experimental procedures are described in detail, and the results indicate that the rough set-based approach shows good prospects in natural language processing (NLP). At the end of the paper, we discuss the shortcomings of this approach and put forward suggestions for its improvement.
Keywords :
computational linguistics; knowledge based systems; learning (artificial intelligence); natural languages; rough set theory; BaseNP identification; BaseNP tagging; base noun phrase identification; decision making process; feature reduction; finite state transducer; machine learning; natural language processing; rough set-based rule learning; rule optimization; tagging procedure; Contracts; Learning systems; Machine learning algorithms; Set theory; System performance; Tagging; Text recognition; Transducers; base noun phrase; rough sets; rule-based method
Conference_Title :
Intelligent Control and Automation, 2006. WCICA 2006. The Sixth World Congress on
Conference_Location :
Dalian
Print_ISBN :
1-4244-0332-4
DOI :
10.1109/WCICA.2006.1713146