Title :
Bilingual chunk alignment in statistical machine translation
Author :
Zhou, Yu ; Zong, Chengqing ; Xu, Bo
Author_Institution :
Inst. of Autom., Chinese Acad. of Sci., Beijing, China
Abstract :
This paper proposes a new algorithm, multilayer filtering (MLF), for automatically extracting aligned bilingual chunks from a Chinese-English parallel corpus. Multiple layers are used to extract bilingual chunks according to different features of the chunks in the bilingual corpus, and the aligned chunks are in one-to-one correspondence with each other. Unlike most conventional algorithms, the chunking and alignment algorithm does not rely on tagging, parsing, syntactic analysis, or segmentation of the Chinese corpus. Preliminary experimental results show that the algorithm performs well in chunking and alignment. Moreover, the translations generated with this algorithm are much better than those generated by the baseline (word-based statistical machine translation).
Keywords :
language translation; natural languages; statistical analysis; Chinese-English parallel corpus; bilingual chunk alignment; multilayer filtering; syntax analyzing; word-based statistical machine translation; Automation; Data mining; Filtering algorithms; Performance analysis; Robustness; Speech; Statistical analysis; Surface-mount technology; Tagging; Training data
Conference_Title :
Systems, Man and Cybernetics, 2004 IEEE International Conference on
Print_ISBN :
0-7803-8566-7
DOI :
10.1109/ICSMC.2004.1399826