• DocumentCode
    3037174
  • Title

    A hybrid model for word alignment with bilingual corpus

  • Author

    Liang, Chen ; Jinan, Xu ; Yujie, Zhang

  • Author_Institution
    Sch. of Comput. & Inf. Technol., Beijing Jiao Tong Univ., Beijing, China
  • Volume
    3
  • fYear
    2012
  • fDate
    25-27 May 2012
  • Firstpage
    99
  • Lastpage
    103
  • Abstract
    Word alignment is a key research topic in text information processing. In this paper, we propose a hybrid word alignment model that integrates the five IBM models, a Word Entropy model, and a Support Information model[1]. The Support Information model comprises two sub-models: the Minimum Intersection model and the Minimum Difference model. Prior research indicates that the IBM models achieve high recall but low precision, whereas the Support Information model yields low recall but high precision, and the Word Entropy model effectively reduces the noise introduced by other words. We therefore combine these models to exploit their complementary strengths. In our experiments, the hybrid model achieves an F-measure of 89.19%, with a recall of 88.74% and a precision of 89.66%.
  • Keywords
    computational linguistics; text analysis; word processing; IBM 5-model; bilingual corpus; minimum difference model; minimum intersection model; support information model; text information processing; word alignment; word entropy model; Computational modeling; Data models; Educational institutions; Entropy; Hidden Markov models; NIST; Training data; IBM models; Support Information; Word Alignment; Word Entropy;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Automation Engineering (CSAE), 2012 IEEE International Conference on
  • Conference_Location
    Zhangjiajie
  • Print_ISBN
    978-1-4673-0088-9
  • Type

    conf

  • DOI
    10.1109/CSAE.2012.6272917
  • Filename
    6272917