Conference record number:
3540
Article title:
Word-Level Confidence Estimation for Statistical Machine Translation using IBM-1 Model
Author/Authors :
Mohammad Mahdi Mahsuli Human Language Technology Lab - Department of Computer Engineering and Information Technology - Amirkabir University of Technology (Tehran Polytechnic) Tehran, Iran , Shahram Khadivi Human Language Technology Lab - Department of Computer Engineering and Information Technology - Amirkabir University of Technology (Tehran Polytechnic) Tehran, Iran
Keywords:
translation error rate , IBM-1 model , machine translation , confidence measure , confidence estimation , natural language processing
Year of publication:
1392
Conference title:
International Symposium on Artificial Intelligence and Signal Processing
Document language:
Latin
Latin abstract:
Confidence estimation for machine translation is a method for labeling each word in a machine translation system's output as “correct” or “incorrect”. In this paper, we present new confidence measures based on the IBM-1 model. Unlike many other confidence measures, they do not rely on system output such as N-best lists or word graphs, and they are very cheap to compute. These confidence measures are therefore applicable to any kind of machine translation system. Experiments were performed on the translation of news lines for the English-Farsi language pair. The new confidence measures perform better than similar existing confidence measures. Moreover, we introduce a method for tagging unlabeled training samples. This method, called translation error rate, has given promising results in machine translation but has not yet been used in confidence estimation.
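The abstract does not spell out the proposed measures, so the following is only a minimal sketch of one generic way an IBM-1 lexicon could score target words: average the lexical probability t(e | f) over the source words plus an empty (NULL) word, then threshold the result. The function names, the `<NULL>` token, the toy probability table, and the threshold value are all illustrative assumptions and are not taken from the paper.

```python
def ibm1_word_confidence(target_word, source_words, lex_prob, null_token="<NULL>"):
    """Average IBM-1 lexical probability t(target_word | f) over the
    source words plus a NULL word. `lex_prob` is a hypothetical dict
    mapping (target_word, source_word) pairs to probabilities."""
    candidates = [null_token] + list(source_words)
    total = sum(lex_prob.get((target_word, f), 0.0) for f in candidates)
    return total / len(candidates)


def label_words(target_words, source_words, lex_prob, threshold=0.1):
    """Tag each target word as "correct" or "incorrect" by comparing its
    confidence with a threshold (the threshold value is an assumption)."""
    return [
        "correct"
        if ibm1_word_confidence(e, source_words, lex_prob) >= threshold
        else "incorrect"
        for e in target_words
    ]


# Toy lexicon with a single entry, for illustration only.
toy_lex = {("house", "haus"): 0.9}
labels = label_words(["house", "cat"], ["das", "haus"], toy_lex)
```

A score of this kind needs only the source sentence, the candidate translation, and a lexicon table, which matches the abstract's point that no N-best lists or word graphs from the translation system are required.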
Country:
Iran
Number of pages:
9
From page:
1
To page:
9