• DocumentCode
    3767516
  • Title

    Exploring hybrid character-words representational unit in Classical-to-Modern Chinese machine translation

  • Author

    Hongyang Zhang; Muyun Yang; Tiejun Zhao

  • Author_Institution
    Machine Intelligence & Translation Laboratory, School of Computer Science and Technology, Harbin Institute of Technology, China
  • fYear
    2015
  • Firstpage
    33
  • Lastpage
    36
  • Abstract
    This paper investigates hybrid representational units in statistical machine translation (SMT) from Classical to Modern Chinese, where the basic unit of Modern Chinese is a mixture of Chinese characters and words, while the basic unit of Classical Chinese is the character. We explore several approaches to hybridizing characters and words in SMT. The best method achieves a gain of 0.33 BLEU points, or 1.2% relative, over the best of the SMT baseline systems modeled with different representational granularities. Furthermore, we find that changing the distortion limit in SMT has a relatively small effect on the quality of our hybrid character-word unit system.
  • Keywords
    "Zirconium","Merging","Hidden Markov models"
  • Publisher
    IEEE
  • Conference_Titel
    Asian Language Processing (IALP), 2015 International Conference on
  • Print_ISBN
    978-1-4673-9595-3
  • Type

    conf

  • DOI
    10.1109/IALP.2015.7451525
  • Filename
    7451525