DocumentCode :
3637224
Title :
Statistical machine translation of Croatian weather forecast: How much data do we need?
Author :
Nikola Ljubešić;Petra Bago;Damir Boras
Author_Institution :
Department of Information Sciences, Faculty of Humanities and Social Sciences, Ivana Luč
fYear :
2010
Firstpage :
91
Lastpage :
96
Abstract :
This research is a first step towards a system for translating Croatian weather forecasts into multiple languages. This step deals with the Croatian-English language pair. The parallel corpus is a one-year sample of weather forecasts for the Adriatic, comprising 7,893 sentence pairs. Evaluation is performed with the best-known automatic evaluation measures, BLEU, NIST and METEOR, as well as by manually evaluating a sample of 200 translations. We show that with a small training set and the state-of-the-art Moses system, decoding can be done with 96% accuracy with respect to adequacy and fluency. Further improvement is expected from increasing the training set size.
Keywords :
"Humans","NIST","Training","Correlation","Size measurement","Weather forecasting"
Publisher :
IEEE
Conference_Titel :
Information Technology Interfaces (ITI), 2010 32nd International Conference on
ISSN :
1330-1012
Print_ISBN :
978-1-4244-5732-8
Type :
conf
Filename :
5546371