Title :
Bilingual Phrase Extraction from N-Best Alignments
Author :
Xue, Yong-zeng ; Li, Sheng ; Zhao, Tie-jun ; Yang, Mu-yun ; Li, Jun
Author_Institution :
MOE-MS Key Lab. of Nat. Language Process. & Speech, Harbin Inst. of Technol.
fDate :
Aug. 30 2006-Sept. 1 2006
Abstract :
An improved approach to phrase extraction is proposed for phrase-based statistical machine translation. The effectiveness of using n-best alignments instead of the one-best alignment for phrase extraction is investigated. In the presented approach, bilingual phrase pairs are extracted by combining word-to-word links from the n-best alignments between source and target sentences. First, the n-best alignments are divided into hierarchical layers by the frequencies of word co-occurrence. Second, candidate phrase pairs are extracted from each layer. Experimental results show that the presented approach outperforms the baseline system Pharaoh in both NIST and BLEU scores. Using n-best alignments as an extension of the one-best alignment is therefore effective for phrase extraction.
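The two steps named in the abstract can be illustrated with a minimal sketch. The abstract does not give the algorithm's details, so everything below is an assumption made for illustration: a link's "frequency" is taken to be the number of n-best alignments that contain it, the layers are nested sets selected by hypothetical count thresholds, and candidates are extracted with the standard alignment-consistency check used in phrase-based systems. The function names (`layer_links`, `extract_phrases`, `extract_from_nbest`) and parameters (`thresholds`, `max_len`) are hypothetical and do not come from the paper.

```python
from collections import Counter

def layer_links(nbest, thresholds):
    """Group word-to-word links into layers by how many of the n-best
    alignments contain them (assumed reading of 'frequency')."""
    counts = Counter(link for alignment in nbest for link in set(alignment))
    # One layer per threshold: links seen in at least t alignments.
    return [{link for link, c in counts.items() if c >= t} for t in thresholds]

def extract_phrases(links, src_len, max_len=3):
    """Extract phrase pairs consistent with a set of (src, tgt) links.
    Returns ((src_start, src_end), (tgt_start, tgt_end)) index spans."""
    pairs = set()
    for i1 in range(src_len):
        for i2 in range(i1, min(i1 + max_len, src_len)):
            # Target positions linked to the source span [i1, i2].
            tgt = [j for (i, j) in links if i1 <= i <= i2]
            if not tgt:
                continue
            j1, j2 = min(tgt), max(tgt)
            if j2 - j1 >= max_len:
                continue
            # Consistent if no link in the target span leaves the source span.
            if all(i1 <= i <= i2 for (i, j) in links if j1 <= j <= j2):
                pairs.add(((i1, i2), (j1, j2)))
    return pairs

def extract_from_nbest(nbest, src_len, thresholds):
    """Union of the candidate phrase pairs extracted from every layer."""
    result = set()
    for layer in layer_links(nbest, thresholds):
        result |= extract_phrases(layer, src_len)
    return result
```

Under these assumptions, a high threshold yields a sparse, high-confidence layer that admits longer phrase pairs, while a low threshold adds links found only in lower-ranked alignments. The sketch does not extend phrase pairs over unaligned target words and does not score the candidates.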
Keywords :
computational linguistics; language translation; linguistics; BLEU scores; NIST; Pharaoh baseline system; bilingual phrase extraction; bilingual phrase pairs; n-best alignments; phrase-based statistical machine translation; word cooccurrence; word-to-word links; Computer science; Context modeling; Data mining; Encoding; Error analysis; Frequency conversion; Laboratories; NIST; Natural languages; Speech processing;
Conference_Title :
Innovative Computing, Information and Control, 2006. ICICIC '06. First International Conference on
Conference_Location :
Beijing
Print_ISBN :
0-7695-2616-0
DOI :
10.1109/ICICIC.2006.426