• DocumentCode
    2910254
  • Title

    An Integrated Approach Using Conditional Random Fields for Named Entity Recognition and Person Property Extraction in Vietnamese Text

  • Author

    Le, Hoang-Quynh ; Tran, Mai-Vu ; Bui, Nhat-Nam ; Phan, Nguyen-Cuong ; Ha, Quang-Thuy

  • Author_Institution
    Coll. of Technol., KTLab, Vietnam Nat. Univ., Hanoi, Hanoi, Vietnam
  • fYear
    2011
  • fDate
    15-17 Nov. 2011
  • Firstpage
    115
  • Lastpage
    118
  • Abstract
    Personal names are among the most frequently searched items in web search engines, and a person entity is always associated with numerous properties. In this paper, we propose an integrated model for Vietnamese that simultaneously recognizes person entities and extracts the relevant values of a pre-defined set of properties related to each person. We also design a rich feature set using various kinds of knowledge resources and apply conditional random fields (CRFs), a well-known machine learning method, to improve the results. The obtained results show that our method is suitable for Vietnamese, with an average precision of 84%, recall of 82.56%, and F-measure of 83.39%. Moreover, the running time is good, and the results also show the effectiveness of our feature set.
  • Keywords
    learning (artificial intelligence); search engines; text analysis; Vietnamese text; Web search engine; conditional random field; integrated model; knowledge resource; machine learning method; named entity recognition; person entity recognition; person property extraction; Data mining; Dictionaries; Feature extraction; Labeling; Tagging; Text recognition; Training; conditional random fields; person named entity; person property extraction; property extraction; property relation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Asian Language Processing (IALP), 2011 International Conference on
  • Conference_Location
    Penang
  • Print_ISBN
    978-1-4577-1733-8
  • Type

    conf

  • DOI
    10.1109/IALP.2011.37
  • Filename
    6121483