Title :
Use of statistical N-gram models in natural language generation for machine translation
Author :
Liu, Fu-Hua ; Gu, Liang ; Gao, Yuqing ; Picheny, Michael
Author_Institution :
IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
This paper describes several language modeling issues in a speech-to-speech translation system. First, the language models for the speech recognizer need to be adapted to the specific domain to improve recognition performance on in-domain utterances, while keeping domain coverage as broad as possible. Second, when a maximum-entropy-based statistical natural language generation model is used to generate the target-language sentence as the translation output, serious inflection and synonym issues arise, because a compromise is made in the semantic representation to avoid the data sparseness problem. We use N-gram models as a postprocessing step to enhance generation performance. When an interpolated language model is applied to a Chinese-to-English translation task, translation performance, measured by the objective BLEU metric, improves substantially from 0.318 to 0.514 when the correct transcription is used as input. Similarly, the BLEU score improves from 0.194 to 0.300 on the same task when the input is speech data.
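The abstract does not specify how the interpolated N-gram model is built or applied, so the following is only a minimal sketch of one common arrangement: two bigram models (one in-domain, one general) are linearly interpolated and used to pick among candidate outputs that differ in inflection. The bigram order, the add-one smoothing, the weight `lam`, the toy training sentences, and all function names are assumptions made for illustration, not details taken from the paper.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Collect history and bigram counts from whitespace-tokenized sentences."""
    histories, bigrams, vocab = Counter(), Counter(), {"</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens[1:])
        histories.update(tokens[:-1])               # count of each word as a history
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    return histories, bigrams, vocab


def bigram_prob(model, prev, word, vocab_size):
    """Add-one smoothed P(word | prev); the smoothing choice is an assumption."""
    histories, bigrams, _ = model
    return (bigrams[(prev, word)] + 1) / (histories[prev] + vocab_size)


def interpolated_logprob(sentence, in_domain, general, lam=0.7):
    """Log-probability under lam * P_in_domain + (1 - lam) * P_general."""
    vocab_size = len(in_domain[2] | general[2])
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens[:-1], tokens[1:]):
        p = (lam * bigram_prob(in_domain, prev, word, vocab_size)
             + (1.0 - lam) * bigram_prob(general, prev, word, vocab_size))
        total += math.log(p)
    return total


def pick_best(candidates, in_domain, general, lam=0.7):
    """Postprocessing step: keep the candidate the interpolated model scores highest."""
    return max(candidates,
               key=lambda c: interpolated_logprob(c, in_domain, general, lam))


# Toy data, purely hypothetical.
in_domain_lm = train_bigram(["i want two tickets", "he wants a ticket"])
general_lm = train_bigram(["they want tickets", "she wants a room"])
print(pick_best(["he want a ticket", "he wants a ticket"], in_domain_lm, general_lm))
```

In practice the interpolation weight would be tuned on held-out data, and the candidate list would come from the generation model rather than being written by hand.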
Keywords :
language translation; maximum entropy methods; natural language interfaces; speech processing; speech recognition; statistical analysis; BLEU; Chinese-to-English translation task; in-domain utterances; inflection; interpolated language model; language modeling; language models; machine translation; maximum entropy; natural language generation; postprocessing step; recognition performance; semantic representation; speech recognizer; speech-to-speech translation system; statistical N-gram models; synonym; target language sentence; Data mining; Entropy; Globalization; Internet; Natural languages; Scheduling; Speech processing; Speech recognition; Speech synthesis; Switches
Conference_Title :
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
Print_ISBN :
0-7803-7663-3
DOI :
10.1109/ICASSP.2003.1198861