Title :
Applying log linear model based context dependent machine translation techniques to grapheme-to-phoneme conversion
Author :
Zhang, Rong ; Zhou, Bowen
Author_Institution :
IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
Grapheme-to-Phoneme conversion is a challenging task for speech recognition and text-to-speech systems, for which the ability to automatically predict pronunciations of out-of-vocabulary (OOV) words is highly desirable. In this paper, Grapheme-to-Phoneme conversion is viewed as a special case of the sequence translation problem, and we propose to tackle it with a phrase-based log-linear translation model. We improve the standard machine translation method by utilizing context-dependent units, which lead to a better many-to-many alignment between chunks of graphemes and phonemes. Furthermore, a hypothesis combination technique is applied to combine the outputs generated by multiple translation models trained with different alignment units. Our proposed approach was evaluated on the NetTalk and CMUDict datasets. Significant improvements in conversion accuracy are observed on both sets compared to the conventional translation method: phoneme-level error rates are reduced relatively by 18.4% and 22.5%, respectively. Our approach also performs better than or as well as previously published data-driven methods examined on the same tasks.
Keywords :
language translation; speech recognition; speech synthesis; CMUDict datasets; NetTalk datasets; OOV words; context dependent machine translation techniques; grapheme-to-phoneme conversion; log linear model; multiple translation models; sequence translation problem; text-to-speech systems; Classification tree analysis; Context modeling; Dictionaries; Error analysis; Hidden Markov models; Machine learning algorithms; Prediction methods; Predictive models; Grapheme-to-Phone conversion
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2010.5495551