DocumentCode
3037174
Title
A hybrid model for word alignment with bilingual corpus
Author
Liang, Chen ; Jinan, Xu ; Yujie, Zhang
Author_Institution
Sch. of Comput. & Inf. Technol., Beijing Jiao Tong Univ., Beijing, China
Volume
3
fYear
2012
fDate
25-27 May 2012
Firstpage
99
Lastpage
103
Abstract
Word alignment is a key research topic in text information processing. In this paper, we propose a hybrid word alignment model that organically combines the IBM 5-models, the Word Entropy model, and the Support Information model [1]. The Support Information model comprises two sub-models: the Minimum Intersection model and the Minimum Difference model. Research indicates that the IBM models achieve high recall but low precision in word alignment, whereas the Support Information model yields low recall but high precision, and the Word Entropy model effectively reduces the noise introduced by other words. We therefore combine these models to exploit their complementary strengths. In our experiments, the hybrid model achieves an F-measure of 89.19%, a recall of 88.74%, and a precision of 89.66%.
Keywords
computational linguistics; text analysis; word processing; IBM 5-model; bilingual corpus; minimum difference model; minimum intersection model; support information model; text information processing; word alignment; word entropy model; Computational modeling; Data models; Educational institutions; Entropy; Hidden Markov models; NIST; Training data; IBM models; Support Information; Word Alignment; Word Entropy
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science and Automation Engineering (CSAE), 2012 IEEE International Conference on
Conference_Location
Zhangjiajie
Print_ISBN
978-1-4673-0088-9
Type
conf
DOI
10.1109/CSAE.2012.6272917
Filename
6272917