Title :
Experiment on a phrase-based statistical machine translation using PoS Tag information for Sundanese into Indonesian
Author :
Arie Ardiyanti Suryani;Dwi Hendratmo Widyantoro;Ayu Purwarianti;Yayat Sudaryat
Author_Institution :
Sekolah Teknik Elektro dan Informatika-ITB, Bandung, Indonesia
Abstract :
This paper addresses Sundanese-to-Indonesian text translation, a low-resource language pair. The size of the parallel corpus has a significant impact on the quality of statistical machine translation, yet to date no ready-to-use Sundanese-Indonesian parallel corpus exists. We therefore use PoS tags in addition to the surface form in the translation model to obtain better translation results. The experiment was conducted to obtain early results for Sundanese-to-Indonesian text translation and to identify the problems that arise. The results show that the model using both surface form and PoS tags slightly outperformed the model using only the surface form. However, the experiment faced two problems: a large number of out-of-vocabulary (OOV) words, caused by the limited size of the parallel corpus, and improper phrase translations, caused by noise in the parallel corpus such as typos and inconsistent writing of words in the Sundanese corpus.
Keywords :
"Training","Mathematical model","Decoding","Probability","Writing","Pragmatics","Cleaning"
Conference_Titel :
Information Technology Systems and Innovation (ICITSI), 2015 International Conference on
Print_ISBN :
978-1-4673-6663-2
DOI :
10.1109/ICITSI.2015.7437678