DocumentCode :
1694031
Title :
Designing a Multi-dimensional Space for Hybrid Information Extraction
Author :
Feilmayr, Christina ; Vojinovic, Klaudija ; Pröll, Birgit
Author_Institution :
Inst. of Applic. Oriented Knowledge Process. (FAW), Johannes Kepler Univ. Linz, Linz, Austria
fYear :
2012
Firstpage :
121
Lastpage :
125
Abstract :
Information extraction systems are developed for various specific application domains to manage an increasing amount of unstructured data. The majority build either upon the knowledge-based approach, which promises high accuracy but involves labour-intensive coding of extraction rules, or upon the automatically trainable systems approach, which produces highly portable solutions but requires an appropriate learning set. In this paper, we present results of a project that aims to provide a new methodology which combines the knowledge-based and the machine learning approach into a hybrid one in order to compensate for their respective shortcomings and to achieve high IE performance. Firstly, we propose the idea of a multi-dimensional space that guides users in selecting appropriate methods, i.e., different hybrid concepts, depending on the extraction task and the level of available features. Secondly, we provide the concept of one hybrid approach, namely the sequential processing of a knowledge-based approach and a selection of different machine learning methods. Thirdly, we present the evaluation of an implementation of the sequential extraction on a curriculum vitae corpus. Thus, we provide first results for filling the multi-dimensional space for hybrid information extraction.
Keywords :
information retrieval; learning (artificial intelligence); IE performance; curriculum vitae corpus; extraction rules coding; hybrid information extraction system; knowledge-based approach sequential processing; machine learning approach; multidimensional space design; sequential extraction; trainable systems approach; Data mining; Feature extraction; Information retrieval; Knowledge based systems; Learning systems; Machine learning; Training; (Statistical) Machine Learning; Extraction Methodology; Hybrid Information Extraction;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Database and Expert Systems Applications (DEXA), 2012 23rd International Workshop on
Conference_Location :
Vienna
ISSN :
1529-4188
Print_ISBN :
978-1-4673-2621-6
Type :
conf
DOI :
10.1109/DEXA.2012.34
Filename :
6327413