Domain Adaptive Information Extraction Using Link Grammar and WordNet

Author

Phyu, Aye Lelt Lelt ; Thein, Nilar

Author_Institution

Univ. of Comput. Studies, Yangon

fYear

2007

fDate

24-26 Jan. 2007

Firstpage

47

Lastpage

53

Abstract

Nowadays, people want to extract a variety of information from online texts. As more and more text becomes available online, there is an emerging need for systems that extract information automatically from text corpora. One of the principal challenges of information extraction is the efficient customization of a system to a new domain. Adapting an information extraction system to a new domain entails the construction of a new set of extraction rules. Many recent information extraction systems have ignored the tedious and time-consuming nature of that process. This paper proposes an alternative approach, which generates candidate extraction rules from an untagged text corpus using the Link Grammar Parser and filters them into the final extraction rules using WordNet and linguistic patterns. The proposed method not only reduces the time and effort required to create an appropriate training corpus but also obviates the need to examine many candidate extraction rules, so that the system can be ported easily to different domains.

Keywords

grammars; information filters; information retrieval; text analysis; WordNet; domain adaptive information extraction; extraction rules; link grammar parser; online texts; text corpus; Data mining; Databases; Filters; Impedance matching; International collaboration; Internet; Java; Joining processes; Storms; Tornadoes

fLanguage

English

Publisher

IEEE

Conference_Titel

Creating, Connecting and Collaborating through Computing, 2007. C5 '07. The Fifth International Conference on

Conference_Location

Kyoto

Print_ISBN

0-7695-2806-6

Type

conf

DOI

10.1109/C5.2007.11

Filename

4144933