DocumentCode :
2379143
Title :
Improving Vietnamese POS tagging by integrating a rich feature set and Support Vector Machines
Author :
Nghiem, Minh ; Dinh, Dien ; Nguyen, Mai
Author_Institution :
Fac. of Inf. Technol., Univ. of Sci., Ho Chi Minh City
fYear :
2008
fDate :
13-17 July 2008
Firstpage :
128
Lastpage :
133
Abstract :
Part-of-speech (POS) tagging is a fundamental task in natural language processing. Many methods have been applied to English, for which the task is considered well solved. However, there are few studies of this problem for Vietnamese. In this paper, we evaluate common features for English POS tagging and then propose some language-specific features for Vietnamese POS tagging. Experimental results on the Vietnamese Lexicography Center's research group's corpus show that our POS tagger, using this feature set and trained with an SVM, outperforms other Vietnamese POS taggers.
Keywords :
natural language processing; speech processing; support vector machines; Vietnamese language; language specific features; part of speech tagging; rich feature set; Feature extraction; Hidden Markov models; Information technology; Machine learning; Natural languages; Support vector machine classification; Tagging;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Research, Innovation and Vision for the Future, 2008. RIVF 2008. IEEE International Conference on
Conference_Location :
Ho Chi Minh City
Print_ISBN :
978-1-4244-2379-8
Electronic_ISBN :
978-1-4244-2380-4
Type :
conf
DOI :
10.1109/RIVF.2008.4586344
Filename :
4586344