Title :
Improving Chinese-English patent machine translation using sentence segmentation
Author :
Jin, Yaohong ; Liu, Zhiying
Author_Institution :
Inst. of Chinese Inf. Process., Beijing Normal Univ., Beijing, China
Abstract :
This paper presents a method that uses sentence segmentation to improve the performance of Chinese-English patent machine translation. In this method, long Chinese sentences are segmented into separate short sentences using features from the Hierarchical Network of Concepts theory (HNC theory). Several semantic features are introduced, including the main verb of a CSC (Eg), the main verb of a CSP (Egp), long NPs, and conjunctions. The main purpose of the segmentation algorithm is to detect whether a CSC can stand as a separate sentence. The segmentation method was integrated with a rule-based MT system. The order of the resulting short translations was adjusted, and differences in expression between Chinese and English were also taken into account. The experimental results show that the performance of Chinese-English patent translation improved. The method has been integrated into an online patent MT system running at SIPO.
Keywords :
language translation; natural language processing; patents; word processing; Chinese-English patent machine translation; HNC theory; Hierarchical Network of Concepts theory; SIPO; online patent machine translation system; rule base machine translation system; sentence segmentation; short sentence translations sequence; Buildings; Google; Machine Translation; long NP; main verb; semantic features;
Conference_Title :
Natural Language Processing and Knowledge Engineering (NLP-KE), 2010 International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-6896-6
DOI :
10.1109/NLPKE.2010.5587855