DocumentCode :
1625618
Title :
A pool-based active learning method for improving Farsi-English Machine Translation system
Author :
Bakhshaei, Somayeh ; Khadivi, Shahram
Author_Institution :
Comput. Sci. & Inf. Theor. Dept., Amirkabir Univ. of Technol., Tehran, Iran
fYear :
2012
Firstpage :
822
Lastpage :
826
Abstract :
In this paper we try to alleviate the problem of scarce resources for developing a Farsi-English statistical machine translation (SMT) system. We do this by applying active learning (AL) to choose the more informative sentences to be translated by a human and then added to the baseline corpus. Although human translations are more valuable than those produced by other corpus-gathering approaches (such as automatic ones), they are also more costly. Selecting sentences in this way lets us improve the translation system at lower cost. This is done in interaction with the human translator. Applying active learning to an SMT system turns it into a system that can improve its baseline corpus by asking for the essential data that directly leads to system improvement. In addition, combining AL with SMT is a way of using source-side monolingual resources to improve SMT systems, which is ignored in the original theory of SMT. Our results for the Farsi-English system show an improvement compared with random sentence selection.
Keywords :
language translation; learning (artificial intelligence); Farsi-English machine translation system; SMT system; base-line corpus; human translations; human translator; informative sentences; pool-based active learning method; random sentence selection; source side monolingual resources; Current measurement; Data models; Face; Feature extraction; Mathematical model; Uncertainty; Active Learning; Farsi-English SMT; Persian language; Scarce resources;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Telecommunications (IST), 2012 Sixth International Symposium on
Conference_Location :
Tehran
Print_ISBN :
978-1-4673-2072-6
Type :
conf
DOI :
10.1109/ISTEL.2012.6483099
Filename :
6483099