Title :
Acquisition of linguistic patterns for knowledge-based information extraction
Author :
Kim, Jun-Tae ; Moldovan, Dan I.
Author_Institution :
Dept. of Comput. Eng., Dongguk Univ., Seoul, South Korea
fDate :
10/1/1995 12:00:00 AM
Abstract :
The paper presents a method for the automatic acquisition of linguistic patterns that can be used for knowledge-based information extraction from texts. In knowledge-based information extraction, linguistic patterns play a central role in the recognition and classification of input texts. Although the knowledge-based approach has proven effective for information extraction in limited domains, constructing a large number of domain-specific linguistic patterns is difficult. Manual creation of patterns is time-consuming and error-prone, even for a small application domain. To solve the scalability and portability problems, patterns must be acquired automatically. We present the PALKA (Parallel Automatic Linguistic Knowledge Acquisition) system, which acquires linguistic patterns from a set of domain-specific training texts and their desired outputs. A specialized representation of patterns, called FP structures, has been defined. Patterns are constructed in the form of FP structures from training texts, and the acquired patterns are tuned further through the generalization of semantic constraints. An inductive learning mechanism is applied in the generalization step. The PALKA system has been used to generate patterns for our information extraction system developed for the fourth Message Understanding Conference (MUC-4).
Keywords :
knowledge acquisition; knowledge based systems; learning by example; linguistics; natural languages; pattern recognition; word processing; FP structures; PALKA; Parallel Automatic Linguistic Knowledge Acquisition; automatic acquisition; domain specific linguistic patterns; domain specific training text; input text; knowledge based information extraction; knowledge based natural language processing; knowledge-based information extraction; linguistic pattern acquisition; semantic constraints; Data mining; Instruments; Knowledge acquisition; Learning systems; Natural language processing; Pattern analysis; Scalability; Terrorism; Text analysis; Text processing;
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on