DocumentCode :
2358768
Title :
A global rule induction approach to information extraction
Author :
Xiao, Jing ; Chua, Tat-Seng ; Liu, Jimin
Author_Institution :
School of Computing, National University of Singapore, Singapore
fYear :
2003
fDate :
3-5 Nov. 2003
Firstpage :
530
Lastpage :
536
Abstract :
The ability to extract desired pieces of information from natural language texts is an important capability with a growing number of potential applications. This paper presents a pattern rule induction learning system, GRID, which emphasizes the use of the global feature distribution across all training instances in order to make better decisions during rule induction. GRID incorporates features at the lexical, syntactic and semantic levels simultaneously. It induces rules by combining top-down and bottom-up approaches. The features chosen in GRID are general, and they were applied successfully to both semi-structured text and free text. Our experimental results on several publicly available Web page corpora and the MUC-4 test set indicate that our approach is effective.
Keywords :
Web sites; inductive logic programming; information retrieval; learning (artificial intelligence); natural languages; text analysis; GRID; MUC-4 test set; Webpage corpora; bottom-up approach; free text; global feature distribution; global rule induction; information extraction; lexical level features; natural language texts; pattern rule induction learning system; semantic level features; semistructured text; syntactical level features; top-down approach; Data mining; IEEE news; Induction generators; Information resources; Information retrieval; Learning systems; Natural languages; Seminars; Testing; Text categorization;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Tools with Artificial Intelligence, 2003. Proceedings. 15th IEEE International Conference on
ISSN :
1082-3409
Print_ISBN :
0-7695-2038-3
Type :
conf
DOI :
10.1109/TAI.2003.1250236
Filename :
1250236