• DocumentCode
    172492
  • Title
    A maximum entropy based reordering model for Mongolian-Chinese SMT with morphological information
  • Author
    Zhenxin Yang ; Miao Li ; Zede Zhu ; Lei Chen ; Linyu Wei ; Shaoqi Wang
  • Author_Institution
    Inst. of Intell. Machines, Hefei, China
  • fYear
    2014
  • fDate
    20-22 Oct. 2014
  • Firstpage
    175
  • Lastpage
    178
  • Abstract
    Word-order differences between Mongolian and Chinese and the scarcity of parallel corpora are the main problems in Mongolian-Chinese statistical machine translation (SMT). We propose a method that uses morphological information as features in a maximum entropy based phrase reordering model for Mongolian-Chinese SMT. Taking advantage of Mongolian morphology, we add Mongolian stems and affixes as phrase boundary information and use a maximum entropy model to predict the reordering of neighboring blocks. To some extent, our method can alleviate the effect of data sparseness on reordering. In addition, we add part-of-speech (POS) tags as features in the reordering model. Experiments show that the approach outperforms the maximum entropy model that uses only boundary word information and yields a maximum improvement of 0.8 BLEU over the baseline.
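The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a binary straight/inverted decision for two neighboring blocks, in which case a maximum entropy model reduces to logistic regression. The function names, the feature templates, and the placeholder tokens (`s1`, `a1`, and so on) are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

def block_features(left, right):
    """Indicator features from the boundary tokens of two neighboring blocks.

    Each token is a hypothetical (stem, affix, pos) triple; the templates
    below are illustrative assumptions, not the paper's feature set.
    """
    feats = []
    for name, tok in (("left_last", left[-1]), ("right_first", right[0])):
        stem, affix, pos = tok
        feats.append(f"{name}.stem={stem}")
        feats.append(f"{name}.affix={affix}")
        feats.append(f"{name}.pos={pos}")
    return feats

def prob_inverted(weights, feats):
    """P(order = inverted | features) under a two-class maximum entropy model."""
    score = sum(weights[f] for f in feats)
    return 1.0 / (1.0 + math.exp(-score))

def train(examples, epochs=200, lr=0.5):
    """Stochastic gradient ascent on the log-likelihood (no regularization)."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, label in examples:  # label: 1 = inverted, 0 = straight
            p = prob_inverted(weights, feats)
            for f in feats:
                weights[f] += lr * (label - p)
    return weights

# Toy training data with placeholder tokens (not real Mongolian).
train_data = [
    (block_features([("s1", "a1", "N")], [("s2", "a2", "V")]), 1),
    (block_features([("s3", "a0", "ADJ")], [("s4", "a0", "N")]), 0),
]
weights = train(train_data)

# Unseen stems, but the same affixes and POS tags as the inverted example.
unseen = block_features([("s5", "a1", "N")], [("s6", "a2", "V")])
p_unseen = prob_inverted(weights, unseen)
p_straight = prob_inverted(weights, train_data[1][0])
print(p_unseen > 0.5, p_straight < 0.5)
```

In this toy setup the stems `s5` and `s6` never occur in training, so a model restricted to surface boundary words would have no evidence for the pair. The shared affix and POS features still carry weight, which is the sense in which morphological features can soften data sparseness.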
  • Keywords
    language translation; maximum entropy methods; natural language processing; BLEU score; Mongolian affix; Mongolian stem; Mongolian-Chinese SMT; Mongolian-Chinese statistical machine translation; POS; boundary words information; data sparseness; maximum entropy; morphological information; parallel corpus; part-of-speech; phrase boundary information; phrase reordering model; reordering prediction; Decoding; Educational institutions; Entropy; Feature extraction; Morphology; Pragmatics; Training; machine translation; maximum entropy; morphological; reordering;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Asian Language Processing (IALP), 2014 International Conference on
  • Conference_Location
    Kuching
  • Type
    conf
  • DOI
    10.1109/IALP.2014.6973484
  • Filename
    6973484