DocumentCode :
1571676
Title :
Research of English-Chinese Alignment at Word Granularity on Parallel Corpora
Author :
Xu Yang ; Wang Hou-feng ; Lu Xue-qiang
Author_Institution :
Peking Univ., Beijing
fYear :
2008
Firstpage :
223
Lastpage :
228
Abstract :
Bilingual alignment is a crucial problem in natural language processing research, and word alignment is a particularly difficult point among all granularities of alignment. This paper describes an English-Chinese word alignment model, based on a bilingual lexicon and some linguistic knowledge, that works on bilingual corpora. The model is built on the theory of formal optimal partition of bilingual sentence pairs and is applicable to sentence pairs of any natural language. In particular, by introducing some definitions and proving a theorem, we obtain alignment strategies that are independent of alignment direction. The model deals with part-matching cases, solves multi-appear-word problems, and remedies deficiencies of the bilingual lexicon. The experimental results show that the model aligns bilingual corpora at the word level effectively and with high accuracy, while maintaining the grammatical structure of the original sentences.
Keywords :
natural language processing; word processing; English-Chinese alignment; bilingual alignment; bilingual lexicon; bilingual sentence pairs; language knowledge; natural language processing; parallel corpora; word granularity; Concurrent computing; Dictionaries; Information science; Large-scale systems; Machine learning; Natural language processing; Natural languages; Parallel processing; Statistical analysis; Statistics; anchor; bilingual corpora; word alignment;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer and Information Science, 2008. ICIS 08. Seventh IEEE/ACIS International Conference on
Conference_Location :
Portland, OR
Print_ISBN :
978-0-7695-3131-1
Type :
conf
DOI :
10.1109/ICIS.2008.28
Filename :
4529824