Title :
A Hybrid Machine Learning Approach for Information Extraction
Author :
Silva, Eduardo F. A.; Barros, Flavia A.; Prudêncio, Ricardo B. C.
Author_Institution :
Federal University of Pernambuco, Brazil
Abstract :
Information Extraction (IE) aims to extract from textual documents only the relevant data required by the user. In this paper, we propose a hybrid machine learning approach for IE on semi-structured texts that combines conventional text classification techniques with Hidden Markov Models (HMMs). In this approach, a text classifier generates an initial output, which is then refined by an HMM to provide a globally optimal extraction. An implemented prototype was used to extract information from bibliographic references, and the use of the HMM yielded a consistent gain in performance.
Conference_Title :
Hybrid Intelligent Systems, 2006. HIS '06. Sixth International Conference on
Conference_Location :
Rio de Janeiro, Brazil
Print_ISBN :
0-7695-2662-4
DOI :
10.1109/HIS.2006.264927
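
Illustrative sketch (not taken from the paper): the abstract describes a two-stage pipeline in which a classifier labels each text fragment and an HMM then refines the label sequence as a whole. One plausible way to realize the second stage is to treat the true field labels as hidden states, treat the classifier's predicted labels as observations, and decode with the Viterbi algorithm. The field names, the probability tables, and the `viterbi` function below are all assumptions made for illustration; the abstract does not specify the model's parameters or how they are estimated.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state sequence for the observations."""

    def log(p):
        # Work in log space to avoid underflow; zero probability maps to -inf.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at step t.
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log(emit_p[s][observations[t]])
            back[t][s] = prev

    # Trace the best path backwards from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model with two bibliographic fields (values are made up).
states = ["author", "title"]
start_p = {"author": 0.9, "title": 0.1}
trans_p = {
    "author": {"author": 0.8, "title": 0.2},
    "title": {"author": 0.1, "title": 0.9},
}
# emit_p[true_label][classifier_label]: how often the classifier is right or wrong.
emit_p = {
    "author": {"author": 0.7, "title": 0.3},
    "title": {"author": 0.3, "title": 0.7},
}

# Initial per-fragment output of the classifier; the third label looks out of place.
classifier_output = ["author", "title", "author", "title", "title"]
refined = viterbi(classifier_output, states, start_p, trans_p, emit_p)
print(refined)  # ['author', 'title', 'title', 'title', 'title']
```

In this toy run the isolated "author" label in the middle of a run of "title" labels is overridden, because a title-to-author-to-title jump is unlikely under the assumed transition table. This is the sense in which sequence-level decoding can correct locally plausible but globally inconsistent classifier decisions.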