Title :
A Semi-supervised Learning Method for Vietnamese Part-of-Speech Tagging
Author :
Le Minh Nguyen ; Xuan, Bach Ngo ; Viet, Cuong Nguyen ; Minh Pham Quang Nhat ; Shimazu, Akira
Author_Institution :
Sch. of Inf. Sci., JAIST, Ishikawa, Japan
Abstract :
This paper presents a semi-supervised learning method for Vietnamese part-of-speech tagging. We consider two powerful tagging models, Conditional Random Fields (CRFs) and Guided Online-Learning models (GLs), as base learning models. We then propose a semi-supervised tagging model for both CRFs and GLs. The main idea is to use a word-cluster model as an additional source to enrich the feature space of the discriminative learning models in both training and decoding. Experimental results on the Vietnamese Tree-bank (VTB) data show that the proposed method is effective. Our best model achieved an accuracy of 94.10% when tested on VTB and 92.60% on an independent test.
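The idea of enriching a tagger's feature space with word-cluster information can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the clustering algorithm, the cluster identifiers, the feature templates, or the context window, so the `CLUSTERS` map, the `c_unk` fallback, and the feature names are hypothetical. The sketch only shows how a cluster ID learned from unlabeled text could be added next to ordinary lexical features, so that the same feature function is used in both training and decoding.

```python
from typing import Dict, List

# Hypothetical word-to-cluster map. In practice it would be induced from
# unlabeled text by some clustering algorithm; these IDs are made up.
CLUSTERS: Dict[str, str] = {
    "tôi": "c3",
    "học": "c7",
    "đọc": "c7",
    "sách": "c1",
}

UNKNOWN_CLUSTER = "c_unk"  # assumed fallback for words without a cluster
PAD = "<pad>"              # assumed marker for positions outside the sentence


def token_features(words: List[str], i: int,
                   clusters: Dict[str, str]) -> Dict[str, str]:
    """Return lexical features plus word-cluster features for position i."""

    def word_at(j: int) -> str:
        # Lowercased word at position j, or PAD outside the sentence.
        return words[j].lower() if 0 <= j < len(words) else PAD

    def cluster_at(j: int) -> str:
        # Cluster ID of the word at position j, with fallbacks.
        if not 0 <= j < len(words):
            return PAD
        return clusters.get(words[j].lower(), UNKNOWN_CLUSTER)

    return {
        # Ordinary lexical features (previous, current, next word).
        "w-1": word_at(i - 1),
        "w0": word_at(i),
        "w+1": word_at(i + 1),
        # Cluster features added on top of the lexical ones.
        "c-1": cluster_at(i - 1),
        "c0": cluster_at(i),
        "c+1": cluster_at(i + 1),
    }


sentence = ["Tôi", "đọc", "sách"]
features = [token_features(sentence, i, CLUSTERS) for i in range(len(sentence))]
```

In this toy map, "học" and "đọc" share the cluster `c7`, so a word that is rare in the labeled data can still fire the same `c0` feature as a frequent word from its cluster. Feature dictionaries of this shape could then be passed to a sequence labeller such as a CRF.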
Keywords :
learning (artificial intelligence); natural language processing; random processes; Vietnamese part-of-speech tagging; Vietnamese tree-bank data; conditional random fields; discriminate learning models; guided online-learning models; semi-supervised learning method; word-cluster model; Clustering algorithms; Context; Modeling; Speech; Tagging; Training; Training data; Conditional Random Fields; Guided Learning; Part of Speech tagging; Semi-Supervised Learning;
Conference_Title :
Knowledge and Systems Engineering (KSE), 2010 Second International Conference on
Conference_Location :
Hanoi
Print_ISBN :
978-1-4244-8334-1
DOI :
10.1109/KSE.2010.35