DocumentCode :
3037174
Title :
A hybrid model for word alignment with bilingual corpus
Author :
Liang, Chen ; Jinan, Xu ; Yujie, Zhang
Author_Institution :
Sch. of Comput. & Inf. Technol., Beijing Jiao Tong Univ., Beijing, China
Volume :
3
fYear :
2012
fDate :
25-27 May 2012
Firstpage :
99
Lastpage :
103
Abstract :
Word alignment is a key research topic in text information processing. In this paper, we propose a hybrid word alignment model that combines the IBM 5-models, a Word Entropy model, and a Support Information model[1]. The Support Information model has two sub-models: the Minimum Intersection model and the Minimum Difference model. Prior research indicates that the IBM models achieve high recall but low precision, whereas the Support Information model achieves high precision but low recall, and the Word Entropy model can effectively reduce the noise introduced by other words. We therefore combine these models to exploit their complementary strengths. In our experiments, the hybrid model achieves an F-measure of 89.19%, with a recall of 88.74% and a precision of 89.66%.
Keywords :
computational linguistics; text analysis; word processing; IBM 5-model; bilingual corpus; minimum difference model; minimum intersection model; support information model; text information processing; word alignment; word entropy model; Computational modeling; Data models; Educational institutions; Entropy; Hidden Markov models; NIST; Training data; IBM models; Support Information; Word Alignment; Word Entropy;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Science and Automation Engineering (CSAE), 2012 IEEE International Conference on
Conference_Location :
Zhangjiajie
Print_ISBN :
978-1-4673-0088-9
Type :
conf
DOI :
10.1109/CSAE.2012.6272917
Filename :
6272917