Title :
Creating Extraction Pattern by Combining Part of Speech Tagger and Grammatical Parser
Author :
Sari, Yunita ; Hassan, Mohd Fadzil ; Zamin, Norshuhani
Author_Institution :
Dept. of Comput. & Inf. Sci., Univ. Teknol. Petronas, Tronoh, Malaysia
Abstract :
Most previous work on extraction patterns relies on a syntactic analyzer and a semantic tagger to create patterns that extract relevant information from free-text documents or from more structured documents such as Web pages. In this paper, we propose an approach that creates a set of extraction patterns by combining a part-of-speech (POS) tagger with a grammatical parser, specifically the Stanford POS Tagger and the Link Grammar (LG) Parser. The extraction patterns will be used in a named entity recognition (NER) system to identify occurrences of certain entities in free-text documents. We demonstrate the algorithm on accident reports as a case study.
Keywords :
computational linguistics; feature extraction; grammars; text analysis; Web page; free text document; grammatical parser; name entity recognition; part-of-speech tagger; pattern extraction; semantic tagger; syntactic analyzer; Accidents; Data mining; Dictionaries; Information analysis; Information retrieval; Learning systems; Pattern analysis; Pattern recognition; Speech analysis; Web pages; Extraction Pattern; Grammatical structure; LG parser; Stanford POS tagger
Conference_Title :
Computer Technology and Development, 2009. ICCTD '09. International Conference on
Conference_Location :
Kota Kinabalu
Print_ISBN :
978-0-7695-3892-1
DOI :
10.1109/ICCTD.2009.227