DocumentCode :
2379126
Title :
A dependency-based word reordering approach for Statistical Machine Translation
Author :
Hoang, Vu ; Ngo, Mai ; Dinh, Dien
Author_Institution :
Fac. of Inf. Technol., Univ. of Sci., Ho Chi Minh City
fYear :
2008
fDate :
13-17 July 2008
Firstpage :
120
Lastpage :
127
Abstract :
Reordering is of crucial importance for machine translation, and solving the reordering problem can lead to remarkable improvements in translation performance. In this paper, we propose a novel approach to the word reordering problem in statistical machine translation. We rely on dependency relations retrieved from a statistical parser, combined with hand-crafted linguistic rules, to create the transformations. These dependency-based transformations can handle word movement in both phrase-level and word-level reordering, which is difficult for parse-tree-based approaches. The transformations are applied as a preprocessing step to the English side in both training and decoding, to obtain an underlying word order closer to that of Vietnamese. We derive the hand-crafted rules from the syntactic differences in word order between English and Vietnamese. The approach is simple and easy to implement with a small rule set, and it does not lead to rule explosion. We describe experiments with our model on the VCLEVC corpus [18] for English-to-Vietnamese translation, showing significant improvements of about 2-4% BLEU score over the MOSES phrase-based baseline system [19].
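As a rough illustration of the preprocessing idea described in the abstract, the sketch below applies one dependency-based reordering rule to an English sentence. It is a minimal sketch under stated assumptions, not the authors' rule set or implementation: the `Token` class, the `POST_HEAD_LABELS` set, the `reorder` function, and the hand-written dependency parse are all hypothetical. The single rule used (an adjectival modifier moves after the noun it modifies) reflects the fact that Vietnamese places adjectives after nouns.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    head: int  # index of the head token, -1 for the root
    dep: str   # dependency label of the relation to the head


# Hypothetical one-rule set: dependents with these labels are moved
# to the position right after their head.
POST_HEAD_LABELS = {"amod"}


def reorder(tokens):
    """Return the token texts with selected pre-head modifiers moved after their head."""
    order = list(range(len(tokens)))
    for i, tok in enumerate(tokens):
        # Only move modifiers that currently precede their head.
        if tok.dep in POST_HEAD_LABELS and i < tok.head:
            order.remove(i)
            order.insert(order.index(tok.head) + 1, i)
    return [tokens[i].text for i in order]


# Hand-written dependency parse (an assumption, not parser output).
sentence = [
    Token("I", 1, "nsubj"),
    Token("bought", -1, "root"),
    Token("a", 4, "det"),
    Token("red", 4, "amod"),
    Token("car", 1, "dobj"),
]

print(" ".join(reorder(sentence)))
```

In a pipeline of the kind the abstract describes, a function like this would be run over the English side of the corpus before training and over each input sentence before decoding, so that the phrase-based system sees the same reordered English in both stages.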
Keywords :
computational linguistics; grammars; language translation; natural language processing; trees (mathematics); word processing; English language; MOSES phrase-based baseline system; VCLEVC corpus; Vietnamese language; dependency-based word reordering approach; linguistic hand-crafted rules; parse tree; statistical machine translation; statistical parser; syntactic differences; Cities and towns; Context modeling; Data preprocessing; Decoding; Explosions; Information technology; Natural language processing; Natural languages; Power system modeling; Surface-mount technology; Natural language processing; dependency parser; preprocessing; statistical machine translation; transformation; word reordering;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Research, Innovation and Vision for the Future, 2008. RIVF 2008. IEEE International Conference on
Conference_Location :
Ho Chi Minh City
Print_ISBN :
978-1-4244-2379-8
Electronic_ISBN :
978-1-4244-2380-4
Type :
conf
DOI :
10.1109/RIVF.2008.4586343
Filename :
4586343