Title :
Linguistically Motivated and Ontological Features for Vietnamese Named Entity Recognition
Author :
Nguyen, Truc-Vien T. ; Cao, Tru H.
Author_Institution :
Center for Mind/Brain Sci. (CIMeC), Univ. of Trento, Rovereto, Italy
fDate :
Feb. 27, 2012 - March 1, 2012
Abstract :
In this paper, we provide a deep analysis of the effect of linguistic and ontological features on the Vietnamese named entity recognition (NER) task. We show that, when plugged into an off-the-shelf learning framework, simple lexical word and bigram features can encode dependencies among possible NE labels in Vietnamese. Results achieved on a standard annotated corpus support our claim, with an accuracy comparable to the state of the art without any external resource. Moreover, when augmented with ontological features from a large knowledge base, the results in both flat and structured classification are almost competitive. Our findings reveal interesting aspects of linguistically motivated features, including contextual and syntactic patterns for the Vietnamese language. Additionally, results achieved with ontological features show that they can be used to learn classes as specific as needed, resulting in the first high-performance Vietnamese structured NER system.
Keywords :
computational linguistics; knowledge based systems; learning (artificial intelligence); natural language processing; ontologies (artificial intelligence); NE labels; Vietnamese language contextual patterns; Vietnamese language syntactic patterns; Vietnamese named entity recognition; Vietnamese structured NER system; bigram features; knowledge base system; lexical words; linguistic features; linguistic motivation; off-the-shelf learning framework; ontological features; Accuracy; Hidden Markov models; Logic gates; Measurement; Ontologies; Organizations; Robustness
Conference_Title :
Computing and Communication Technologies, Research, Innovation, and Vision for the Future (RIVF), 2012 IEEE RIVF International Conference on
Conference_Location :
Ho Chi Minh City
Print_ISBN :
978-1-4673-0307-1
DOI :
10.1109/rivf.2012.6169818