• DocumentCode
    36449
  • Title
    Graph-based lexicalized reordering models for statistical machine translation
  • Author
    Su Jinsong ; Liu Yang ; Liu Qun ; Dong Huailin
  • Author_Institution
    Xiamen Univ., Xiamen, China
  • Volume
    11
  • Issue
    5
  • fYear
    2014
  • fDate
    May 2014
  • Firstpage
    71
  • Lastpage
    82
  • Abstract
    Lexicalized reordering models are key components of phrase-based translation systems. Conventional methods learn these models from a word-aligned bilingual corpus by examining the reordering relationships between adjacent phrases, but they ignore the effect of the number of adjacent bilingual phrases. In this paper, we propose a method that takes the number of adjacent phrases into account for better estimation of reordering models. Instead of just checking whether there is one phrase adjacent to a given phrase, our method first uses a compact structure called a reordering graph to represent all phrase segmentations of a parallel sentence. The effect of the number of adjacent phrases is then quantified in a forward-backward fashion and finally incorporated into the estimation of reordering models. Experimental results on the NIST Chinese-English and WMT French-Spanish data sets show that our approach significantly outperforms the baseline method.
  • Keywords
    graph theory; language translation; natural language processing; text analysis; NIST Chinese-English data sets; WMT French-Spanish data sets; compact structure named reordering graph; graph-based lexicalized reordering model; parallel sentence; phrase segmentations; phrase-based translation systems; statistical machine translation; Analytical models; Computational modeling; Decoding; Machine learning; Predictive models; Statistical analysis; lexicalized reordering model; reordering graph
  • fLanguage
    English
  • Journal_Title
    Communications, China
  • Publisher
    IEEE
  • ISSN
    1673-5447
  • Type
    jour
  • DOI
    10.1109/CC.2014.6880462
  • Filename
    6880462