DocumentCode :
2798758
Title :
Applying log linear model based context dependent machine translation techniques to grapheme-to-phoneme conversion
Author :
Zhang, Rong ; Zhou, Bowen
Author_Institution :
IBM T. J. Watson Research Center, Yorktown Heights, NY, USA
fYear :
2010
fDate :
14-19 March 2010
Firstpage :
4634
Lastpage :
4637
Abstract :
Grapheme-to-phoneme conversion is a challenging task for speech recognition and text-to-speech systems, for which the ability to automatically predict pronunciations of out-of-vocabulary (OOV) words is highly desirable. In this paper, grapheme-to-phoneme conversion is viewed as a special case of the sequence translation problem, and we propose to tackle it with a phrase-based log-linear translation model. We improve the standard machine translation method by utilizing context-dependent units, which lead to a better many-to-many alignment between chunks of graphemes and phonemes. Furthermore, a hypothesis combination technique is applied to combine the outputs generated by multiple translation models trained with different alignment units. Our proposed approach was evaluated on the NetTalk and CMUDict datasets. Significant improvements in conversion accuracy are observed on both sets compared to the conventional translation method: phoneme-level error rates are reduced by 18.4% and 22.5% relative, respectively. Our approach also performs as well as or better than previously published data-driven methods evaluated on the same tasks.
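The abstract reports relative reductions in phoneme-level error rate without stating how the rate is computed. The sketch below shows one common way to compute it, assuming the rate is the Levenshtein edit distance between reference and predicted phoneme sequences divided by the total number of reference phonemes. That definition, the function names, and the sample pronunciations are illustrative assumptions, not details taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences (lists of symbols)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference phonemes yields the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phoneme_error_rate(pairs):
    """Assumed definition: total edit errors / total reference phonemes."""
    errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    total = sum(len(ref) for ref, _ in pairs)
    return errors / total


def relative_reduction(baseline, improved):
    """Relative error reduction of `improved` with respect to `baseline`."""
    return (baseline - improved) / baseline


# Hypothetical example: one word with a single substituted phoneme.
pairs = [(["K", "AE", "T"], ["K", "AH", "T"])]
per = phoneme_error_rate(pairs)
```

Under this definition, a figure such as "18.4% relative" corresponds to `relative_reduction(baseline, improved)` evaluating to 0.184, where both arguments are phoneme error rates measured on the same test set.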
Keywords :
language translation; speech recognition; speech synthesis; CMUDict datasets; NetTalk datasets; OOV words; context dependent machine translation techniques; grapheme-to-phoneme conversion; log linear model; multiple translation models; sequence translation problem; speech recognition; text-to-speech systems; Classification tree analysis; Context modeling; Dictionaries; Error analysis; Hidden Markov models; Machine learning algorithms; Prediction methods; Predictive models; Speech recognition; Speech synthesis; Grapheme-to-Phone conversion;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
ISSN :
1520-6149
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2010.5495551
Filename :
5495551