DocumentCode :
11499
Title :
Exploring Diverse Features for Statistical Machine Translation Model Pruning
Author :
Mei Tu ; Yu Zhou ; Chengqing Zong
Author_Institution :
Nat. Lab. of Pattern Recognition, Inst. of Autom., Beijing, China
Volume :
23
Issue :
11
fYear :
2015
fDate :
Nov. 2015
Firstpage :
1847
Lastpage :
1857
Abstract :
In phrase-based and hierarchical phrase-based statistical machine translation systems, translation performance depends heavily on the size and quality of the translation table. To meet real-time response requirements, some research has addressed filtering the translation table. However, most existing methods rely on one or two constraints that act as hard rules, such as disallowing phrase-pairs with low translation probabilities. Such constraints can be too rigid because they consider only a single factor rather than a combination of factors. In this paper, we propose a machine learning-based framework that integrates multiple features for translation model pruning. Experimental results show that our framework is effective: it prunes 80% of the phrase-pairs and 70% of the hierarchical rules while retaining translation quality as measured by the BLEU evaluation metric. Our study further shows that our method can select the most useful phrase-pairs and rules, including those that are low in frequency but still very useful.
Keywords :
filtering theory; language translation; learning (artificial intelligence); BLEU evaluation metric; composite factors; diverse features; hard rules; hierarchical phrase; machine learning; phrase-pairs; real-time response; statistical machine translation model pruning; translation table filter; Bidirectional control; Data models; Decoding; IEEE transactions; Syntactics; Training; Training data; Classification; statistical machine translation (SMT); syntactic constraints; translation model pruning;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
2329-9290
Type :
jour
DOI :
10.1109/TASLP.2015.2456413
Filename :
7156075