Title :
Class-based named entity translation in a speech to speech translation system
Author :
Sameer R. Maskey;Martin Cmejrek;Bowen Zhou;Yuqing Gao
Author_Institution :
IBM T.J. Watson Research Center, Yorktown Heights, New York, USA
Abstract :
Named Entity (NE) translation is a challenging problem in Machine Translation (MT). Most bi-text training corpora for MT lack enough samples of NEs to cover the wide variety of contexts in which NEs can appear. In this paper, we present a technique that translates NEs based on their NE types, in addition to a phrase-based translation model. Our NE translation model is based on a syntax-based system similar to [1], but we produce syntax-based rules whose non-terminals are NE types instead of general non-terminals. Such class-based rules allow us to better generalize over the contexts of NEs. We show that our proposed method obtains an absolute improvement of 0.66 BLEU as well as 0.26% in F1-measure over the phrase-based baseline on the NE test set.
Keywords :
"Engines","Decoding","Speech coding","Training data","Surface-mount technology","Testing","Statistical analysis","Probability","Robustness","Context modeling"
Conference_Titel :
Spoken Language Technology Workshop, 2008. SLT 2008. IEEE
Print_ISBN :
978-1-4244-3471-8
DOI :
10.1109/SLT.2008.4777888