Title :
Concepts extraction from unstructured Polish texts: A rule based approach
Author_Institution :
AGH University of Science and Technology, Poland
Abstract :
We present a recently developed solution for extracting concepts from unstructured Polish texts, with a special focus on producing correct morphological forms of the obtained concept names. As Polish is a highly inflected language, detected names need to be transformed according to Polish grammar rules. We propose a user-friendly method for specifying transformation patterns, based on a simple annotation language. Annotations prepared by a user are compiled into transformation rules. During concept extraction, the input document is split into sentences and the rules are applied to the sequences of words in each sentence. Recognized strings forming concept names are aggregated at various levels and assigned scores. We also report the results of initial experiments performed on a medical text.
Keywords :
"Dictionaries","Speech","Compounds","Grammar","Libraries","Feature extraction","Ontologies"
Conference_Titel :
Computer Science and Information Systems (FedCSIS), 2015 Federated Conference on