DocumentCode :
2660152
Title :
Better statistical estimation can benefit all phrases in phrase-based statistical machine translation
Author :
Sima'an, Khalil ; Mylonakis, Markos
Author_Institution :
Inst. for Logic, Univ. of Amsterdam, Amsterdam
fYear :
2008
fDate :
15-19 Dec. 2008
Firstpage :
237
Lastpage :
240
Abstract :
The heuristic estimates of conditional phrase translation probabilities are based on frequency counts in a word-aligned parallel corpus. Earlier attempts at more principled estimation using Expectation-Maximization (EM) underperform this heuristic. This paper shows that a recently introduced estimator based on smoothing might provide a good alternative. When all phrase pairs are estimated (no length cut-off), this estimator slightly outperforms the heuristic estimator.
Keywords :
expectation-maximisation algorithm; language translation; smoothing methods; conditional phrase translation probabilities; expectation-maximization; phrase-based statistical machine translation; statistical estimation; word-aligned parallel corpus; Concurrent computing; Containers; Data mining; Frequency estimation; Logic; Parameter estimation; Probability; State estimation; Training data; Transduction
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Spoken Language Technology Workshop, 2008. SLT 2008. IEEE
Conference_Location :
Goa
Print_ISBN :
978-1-4244-3471-8
Electronic_ISBN :
978-1-4244-3472-5
Type :
conf
DOI :
10.1109/SLT.2008.4777884
Filename :
4777884