Regional Information Center for Science and Technology (RICeST) - Joint tokenization, parsing, and translation

DocumentCode :

1615849

Title :

Joint tokenization, parsing, and translation

Author :

Liu, Yang

Author_Institution :

Inst. of Comput. Technol. (ICT), Chinese Acad. of Sci., Beijing, China

fYear :

2010

Firstpage :

Lastpage :

Abstract :

Summary form only given. Natural language processing is all about ambiguities. In machine translation, tokenization and parsing mistakes caused by segmentation and structural ambiguities can introduce translation errors. A well-known solution is to provide more alternatives by using compact representations such as lattices and forests. In this talk, I will introduce a technique that goes beyond lattices and forests by integrating tokenization, parsing, and translation in one system, so that the three tasks can interact with and benefit each other in a discriminative framework. Experimental results show that such integration significantly improves tokenization and translation performance.

Keywords :

language translation; natural language processing; forest technique; joint tokenization; lattice technique; machine translation; parsing; translation error

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Universal Communication Symposium (IUCS), 2010 4th International

Conference_Location :

Beijing

Print_ISBN :

978-1-4244-7821-7

Type :

conf

DOI :

10.1109/IUCS.2010.5666651

Filename :

5666651

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1615849