DocumentCode
3767516
Title
Exploring hybrid character-words representational unit in Classical-to-Modern Chinese machine translation
Author
Hongyang Zhang; Muyun Yang; Tiejun Zhao
Author_Institution
Machine Intelligence & Translation Laboratory, School of Computer Science and Technology, Harbin Institute of Technology, China
fYear
2015
Firstpage
33
Lastpage
36
Abstract
This paper investigates hybrid representational units in statistical machine translation (SMT) from Classical to Modern Chinese, where the basic unit of Modern Chinese is a mixture of Chinese characters and words, while the unit for Classical Chinese is the character. We explore several approaches to combining characters and words in SMT. The best method achieves a gain of 0.33 BLEU points, or 1.2% relative, over the best of the SMT baseline systems modeled with different representational granularities. Furthermore, we find that changing the distortion limit in SMT has a relatively small effect on the quality of our hybrid character-words unit system.
Keywords
"Zirconium","Merging","Hidden Markov models"
Publisher
ieee
Conference_Titel
Asian Language Processing (IALP), 2015 International Conference on
Print_ISBN
978-1-4673-9595-3
Type
conf
DOI
10.1109/IALP.2015.7451525
Filename
7451525
Link To Document