Title :
Advances in syntax-based Malay-English speech translation
Author :
Bing Xiang;Bowen Zhou;Martin Cmejrek
Author_Institution :
IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, USA
Abstract :
In this paper, we present advanced techniques that significantly improved the performance of the IBM Malay-English speech translation system. In this work, we generated linguistics-driven hierarchical rules to enhance the formal syntax-based translation model; designed an active learning approach with bi-directional translations that outperformed unsupervised training; and utilized translation direction information in the parallel training corpus to build direction-specific interpolated language models for machine translation. Together, these techniques achieved a 20% relative improvement in translation performance. A state-of-the-art Malay speech recognition system was also established as one of the crucial modules in the rapidly developed Malay-English speech translation system.
Keywords :
"Speech recognition","Natural languages","Machine learning","Bidirectional control","Automatic speech recognition","Data mining","Tagging","Training data","Semisupervised learning","Humans"
Conference_Title :
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Print_ISBN :
978-1-4244-2353-8
Electronic_ISBN :
2379-190X
DOI :
10.1109/ICASSP.2009.4960705