DocumentCode
3318290
Title
A maximum entropy-based sentence simplifier for machine translation
Author
Finch, Andrew ; Shimohata, Mitsuo ; Sumita, Eiichiro
Author_Institution
ATR Spoken Language Translation Res. Labs., Kyoto, Japan
fYear
2005
fDate
30 Oct.-1 Nov. 2005
Firstpage
646
Lastpage
650
Abstract
We present a method for removing unnecessary words from sentences to expedite automatic machine translation. Our hypothesis is that the resulting simplified sentences are easier to translate automatically, giving improved translation performance. We evaluate the sentence simplifier in two ways. First, the system is tested directly against humans on the word deletion task: its output is evaluated against a set of reference sentences, and its performance is compared with that of a test set of human-shortened sentences. We show that the system performs at close to human level on this task. Second, we evaluate the system as a preprocessor for two different machine translation (MT) systems. We show that pre-processing the input significantly improves the performance of an MT system based on the publicly available GIZA++ software, and yields a small improvement in the performance of the more capable ATR translation system.
Keywords
language translation; maximum entropy methods; natural languages; ATR translation system; GIZA++ software; automatic machine translation system; maximum entropy-based sentence simplifier; reference sentences; word deletion task; Cities and towns; Entropy; History; Humans; Laboratories; Natural language processing; Natural languages; Software performance; Speech; System testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Language Processing and Knowledge Engineering, 2005. IEEE NLP-KE '05. Proceedings of 2005 IEEE International Conference on
Print_ISBN
0-7803-9361-9
Type
conf
DOI
10.1109/NLPKE.2005.1598816
Filename
1598816