Regional Information Center for Science and Technology (RICeST) - An unsupervised boosting technique for refining word alignment

DocumentCode :

2329963

Title :

An unsupervised boosting technique for refining word alignment

Author :

Ananthakrishnan, Sankaranarayanan ; Prasad, Rohit ; Natarajan, Prem

Author_Institution :

Raytheon BBN Technol., Cambridge, MA, USA

fYear :

2010

fDate :

12-15 Dec. 2010

Firstpage :

177

Lastpage :

182

Abstract :

Translation rules extracted from automatic word alignment form the basis of statistical machine translation (SMT) systems. An unsupervised expectation-maximization (EM) algorithm is typically used to obtain a word alignment from parallel corpora. Being statistically driven, the alignments produced by this technique are often erroneous. In this paper, we propose an unsupervised boosting strategy for refining automatic word alignment with the goal of improving SMT performance. The proposed approach results in fewer unaligned words, a significant reduction in the number of extracted translation phrase pairs, a corresponding improvement in SMT decoding speed, and a consistent improvement in translation accuracy, as measured by BLEU, across multiple language pairs and test sets. The reduction in storage and processing requirements, coupled with improved accuracy, makes the proposed technique ideally suited for interactive translation services, facilitating applications such as mobile speech-to-speech translation.

Keywords :

language translation; speech processing; statistical analysis; unsupervised learning; EM; SMT; automatic word alignment; interactive translation services; parallel corpora; speech-to-speech translation; statistical machine translation; translation rules extraction; unsupervised boosting technique; unsupervised expectation-maximization; word alignment refinement; boosting; mobile speech-to-speech translation; word alignment

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Spoken Language Technology Workshop (SLT), 2010 IEEE

Conference_Location :

Berkeley, CA

Print_ISBN :

978-1-4244-7904-7

Electronic_ISBN :

978-1-4244-7902-3

Type :

conf

DOI :

10.1109/SLT.2010.5700847

Filename :

5700847

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2329963