Title :
An Integrated Approach Using Conditional Random Fields for Named Entity Recognition and Person Property Extraction in Vietnamese Text
Author :
Le, Hoang-Quynh ; Tran, Mai-Vu ; Bui, Nhat-Nam ; Phan, Nguyen-Cuong ; Ha, Quang-Thuy
Author_Institution :
Coll. of Technol., KTLab, Vietnam Nat. Univ., Hanoi, Hanoi, Vietnam
Abstract :
Personal names are among the most frequently searched items in web search engines, and a person entity is always associated with numerous properties. In this paper, we propose an integrated model for Vietnamese that simultaneously recognizes person entities and extracts the values of a pre-defined set of properties related to each person. We also design a rich feature set using various kinds of knowledge resources and apply conditional random fields (CRFs), a well-known machine learning method, to improve the results. The results show that our method is suitable for Vietnamese, achieving on average 84% precision, 82.56% recall, and 83.39% F-measure. Moreover, the running time is good, and the results also demonstrate the effectiveness of our feature set.
Keywords :
learning (artificial intelligence); search engines; text analysis; Vietnamese text; Web search engine; conditional random field; integrated model; knowledge resource; machine learning method; named entity recognition; person entity recognition; person property extraction; Data mining; Dictionaries; Feature extraction; Labeling; Tagging; Text recognition; Training; conditional random fields; person named entity; person property extraction; property extraction; property relation;
Conference_Title :
Asian Language Processing (IALP), 2011 International Conference on
Conference_Location :
Penang
Print_ISBN :
978-1-4577-1733-8
DOI :
10.1109/IALP.2011.37