Title :
A Japanese preprocessor for syntactic and semantic parsing
Author :
Kitani, Tsuyoshi ; Mitamura, Teruko
Author_Institution :
Center for Machine Translation, Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
The authors describe a Japanese preprocessor that comprises a morphological analyzer called MAJESTY and a proper noun identification program. The original morphological analyzer was modified to disambiguate its output when multiple segmentations or parts of speech are possible. Ambiguous segments are packed locally in the output, enabling a syntactic and semantic parser to process them efficiently. The proper noun identification program then groups the segments that make up a proper noun, so that the parser receives a meaningful set of segments. Tested on financial news articles, the preprocessor segmented text and tagged parts of speech with greater than 98% accuracy. Over 80% of company names and 90% of personal and place names were identified.
Keywords :
grammars; natural languages; speech processing; Japanese preprocessor; MAJESTY; ambiguous segmentation; company names; financial news articles; morphological analyzer; place names; proper noun identification program; semantic parsing; syntactic parsing; tagged parts of speech; Artificial intelligence; Data analysis; Data mining; Finance; Morphology; Natural language processing; Natural languages; Performance analysis; Speech analysis; Testing
Conference_Title :
Artificial Intelligence for Applications, 1993. Proceedings., Ninth Conference on
Conference_Location :
Orlando, FL
Print_ISBN :
0-8186-3840-0
DOI :
10.1109/CAIA.1993.366657