DocumentCode
2969797
Title
A Hybrid Machine Learning Approach for Information Extraction
Author
Silva, Eduardo F A ; Barros, Flavia A. ; Prudêncio, Ricardo B C
Author_Institution
Federal University of Pernambuco, Brazil
fYear
2006
fDate
Dec. 2006
Firstpage
44
Lastpage
44
Abstract
Information Extraction (IE) aims to extract from textual documents only the relevant data required by the user. In this paper, we propose a hybrid machine learning approach for IE on semi-structured texts that combines conventional text classification techniques and Hidden Markov Models (HMM). In this approach, a text classifier generates an initial output, which is refined by an HMM, providing a globally optimal extraction. An implemented prototype was used to extract information from bibliographic references, reaching a consistent gain in performance through the use of the HMM.
fLanguage
English
Publisher
IEEE
Conference_Titel
Hybrid Intelligent Systems, 2006. HIS '06. Sixth International Conference on
Conference_Location
Rio de Janeiro, Brazil
Print_ISBN
0-7695-2662-4
Type
conf
DOI
10.1109/HIS.2006.264927
Filename
4041424