Title :
An Approach to Word Sense Disambiguation in English-Vietnamese-English Statistical Machine Translation
Author :
Nguyen, Quy ; Nguyen, An ; Dinh, Dien
Author_Institution :
Fac. of Comput. Sci., Univ. of Inf. Technol., Ho Chi Minh City, Vietnam
Date :
Feb. 27 - March 1, 2012
Abstract :
The most difficult problem in machine translation (MT) in general, and in statistical machine translation (SMT) in particular, is selecting the correct meaning of polysemous words. The correct meaning depends mainly on the context and the topic of the text. Therefore, to improve the quality of SMT by resolving the semantic ambiguity of words, we integrate additional knowledge about the topic of the text, part of speech (POS), and morphology. We applied this model to an English-Vietnamese-English SMT system, and BLEU scores increased by over 6% compared with the baseline general SMT system, which did not integrate topic information or other linguistic knowledge.
Keywords :
language translation; natural language processing; text analysis; BLEU scores; English-Vietnamese-English statistical machine translation; morphology; part-of-speech; polysemous words; text context; text topic; word sense disambiguation; Buildings; Semantics; Support vector machines; Tagging; Testing; Training; Vectors
Conference_Title :
Computing and Communication Technologies, Research, Innovation, and Vision for the Future (RIVF), 2012 IEEE RIVF International Conference on
Conference_Location :
Ho Chi Minh City
Print_ISBN :
978-1-4673-0307-1
DOI :
10.1109/rivf.2012.6169839