Title :
Domain Adaptive Information Extraction Using Link Grammar and WordNet
Author :
Phyu, Aye Lelt Lelt ; Thein, Nilar
Author_Institution :
Univ. of Comput. Studies, Yangon
Abstract :
People increasingly want to extract a variety of information from online texts. As more and more text becomes available online, there is an emerging need for systems that extract information automatically from a text corpus. One of the principal challenges of information extraction is the efficient customization of a system to a new domain. Adapting an information extraction system to a new domain entails constructing a new set of extraction rules. Many recent information extraction systems have ignored the tedious and time-consuming nature of that process. This paper proposes an alternative approach, which generates candidate extraction rules from an untagged text corpus using the Link Grammar Parser and filters them down to the final extraction rules using WordNet and linguistic patterns. The proposed method not only reduces the time and effort required to create an appropriate training corpus but also obviates the need to examine many candidate extraction rules by hand, so that the system can be ported easily to different domains.
Keywords :
grammars; information filters; information retrieval; text analysis; WordNet; domain adaptive information extraction; extraction rules; link grammar parser; online texts; text corpus; Data mining; Databases; Filters; Impedance matching; International collaboration; Internet; Java; Joining processes; Storms; Tornadoes;
Conference_Title :
Creating, Connecting and Collaborating through Computing, 2007. C5 '07. The Fifth International Conference on
Conference_Location :
Kyoto
Print_ISBN :
0-7695-2806-6