DocumentCode :
644004
Title :
Long sentence partitioning using top-down analysis for machine translation
Author :
Baosheng Yin ; Junjun Zuo ; Na Ye
Author_Institution :
Knowledge Eng. Res. Center, Shenyang Aerosp. Univ., Shenyang, China
Volume :
03
fYear :
2012
fDate :
Oct. 30 2012-Nov. 1 2012
Firstpage :
1425
Lastpage :
1429
Abstract :
Long sentence processing is an important part of English-Chinese machine translation systems, and its correctness directly affects system performance. A basic approach to processing a long sentence is to partition it into short sub-sentences and then merge the sub-translations into a translation of the whole sentence. This paper presents a rule-based top-down partitioning algorithm. The rules are induced from sentence patterns and consist mainly of regular expressions. First, the algorithm reduces some sentence components to shorten the sentence; then coordinate sub-sentences are recognized and partitioned; finally, clauses within the sub-sentences are processed. The experiment shows an accuracy of approximately 85% and a recall of over 90% for the algorithm.
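The three stages named in the abstract (component reduction, coordinate splitting, clause splitting) can be illustrated with a minimal Python sketch. The regular expressions and function names below are hypothetical placeholders chosen for illustration; the abstract does not give the paper's actual rules, so this should not be read as the authors' implementation.

```python
import re

# Hypothetical rules for illustration only; the paper's real rule set
# is induced from sentence patterns and is not listed in the abstract.
PARENTHETICAL = re.compile(r"\s*\([^()]*\)")
COORDINATOR = re.compile(r"[,;]\s+(?:and|but|or)\s+|;\s+")
CLAUSE_MARKER = re.compile(r",?\s+(?=(?:which|because|although|when|while)\b)")


def reduce_components(sentence):
    """Stage 1: shorten the sentence by dropping parenthetical material."""
    return PARENTHETICAL.sub("", sentence).strip()


def split_coordinates(sentence):
    """Stage 2: split at boundaries between coordinate sub-sentences."""
    return [part.strip() for part in COORDINATOR.split(sentence) if part.strip()]


def split_clauses(sub_sentence):
    """Stage 3: split a sub-sentence in front of subordinate-clause markers."""
    return [part.strip() for part in CLAUSE_MARKER.split(sub_sentence) if part.strip()]


def partition(sentence):
    """Apply the three stages top-down and return the resulting segments."""
    reduced = reduce_components(sentence)
    segments = []
    for sub_sentence in split_coordinates(reduced):
        segments.extend(split_clauses(sub_sentence))
    return segments


if __name__ == "__main__":
    example = (
        "The system (built in 2012) parses the input, "
        "and it translates each part because long sentences are hard."
    )
    for segment in partition(example):
        print(segment)
```

Each stage works on the output of the previous one, which is what makes the procedure top-down: the coarsest boundaries are fixed first and finer clause boundaries are sought only inside each sub-sentence.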
Keywords :
language translation; natural language processing; English-Chinese machine translation systems; long sentence partitioning; regular expressions; rule-based top-down partitioning algorithm; sentence patterns; short sub-sentences; top-down analysis; Algorithm design and analysis; Google; Partitioning algorithms; Pattern matching; Pragmatics; Speech; Speech recognition; machine translation; sentence pattern
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cloud Computing and Intelligent Systems (CCIS), 2012 IEEE 2nd International Conference on
Conference_Location :
Hangzhou
Print_ISBN :
978-1-4673-1855-6
Type :
conf
DOI :
10.1109/CCIS.2012.6664620
Filename :
6664620