• DocumentCode
    2969797
  • Title
    A Hybrid Machine Learning Approach for Information Extraction
  • Author
    Silva, Eduardo F. A.; Barros, Flavia A.; Prudêncio, Ricardo B. C.
  • Author_Institution
    Federal University of Pernambuco, Brazil
  • fYear
    2006
  • fDate
    Dec. 2006
  • Firstpage
    44
  • Lastpage
    44
  • Abstract
    Information Extraction (IE) aims to extract from textual documents only the relevant data required by the user. In this paper, we propose a hybrid machine learning approach for IE on semi-structured texts that combines conventional text classification techniques with Hidden Markov Models (HMMs). In this approach, a text classifier generates an initial output, which is then refined by an HMM to provide a globally optimal extraction. An implemented prototype was used to extract information from bibliographic references, and the use of the HMM yielded a consistent gain in performance.
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Hybrid Intelligent Systems, 2006. HIS '06. Sixth International Conference on
  • Conference_Location
    Rio de Janeiro, Brazil
  • Print_ISBN
    0-7695-2662-4
  • Type
    conf
  • DOI
    10.1109/HIS.2006.264927
  • Filename
    4041424