Regional Information Center for Science and Technology - Applying log linear model based context dependent machine translation techniques to grapheme-to-phoneme conversion

DocumentCode :

2798758

Title :

Applying log linear model based context dependent machine translation techniques to grapheme-to-phoneme conversion

Author :

Zhang, Rong ; Zhou, Bowen

Author_Institution :

IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA

fYear :

2010

fDate :

14-19 March 2010

Firstpage :

4634

Lastpage :

4637

Abstract :

Grapheme-to-phoneme conversion is a challenging task for speech recognition and text-to-speech systems, where automatically predicting pronunciations for out-of-vocabulary (OOV) words is highly desirable. In this paper, grapheme-to-phoneme conversion is viewed as a special case of the sequence translation problem, and we propose to tackle it with a phrase-based log-linear translation model. We improve the standard machine translation method by using context-dependent units, which lead to a better many-to-many alignment between chunks of graphemes and phonemes. Furthermore, a hypothesis combination technique is applied to combine the outputs generated by multiple translation models trained with different alignment units. The proposed approach was evaluated on the NetTalk and CMUDict datasets. Significant improvements in conversion accuracy are observed on both sets compared with the conventional translation method: phoneme-level error rates are reduced by 18.4% and 22.5% relative, respectively. Our approach also performs better than or as well as previously published data-driven methods examined on the same tasks.
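The abstract reports results as relative reductions in phoneme-level error rate. The record does not say how the authors scored their outputs, so the following is only a minimal sketch that assumes the common convention: Levenshtein edit distance between reference and predicted phoneme sequences, divided by the total number of reference phonemes. The function names (`edit_distance`, `phoneme_error_rate`, `relative_reduction`) and the toy pronunciations are hypothetical and do not come from the paper.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: Sequence[Sequence[str]],
                       hyps: Sequence[Sequence[str]]) -> float:
    """Total edit errors divided by total reference phonemes (assumed metric)."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return errors / total


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative error reduction of `improved` with respect to `baseline`."""
    return (baseline - improved) / baseline


# Toy data, invented for illustration only.
refs = [["K", "AE", "T"], ["D", "AO", "G"]]
hyps = [["K", "AE", "T"], ["D", "AA", "G", "Z"]]
per = phoneme_error_rate(refs, hyps)
```

Under this assumed definition, a baseline error rate of 0.10 that drops to 0.08 corresponds to `relative_reduction(0.10, 0.08)`, a relative reduction of about 20%; the 18.4% and 22.5% figures in the abstract are relative reductions of this kind.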

Keywords :

language translation; speech recognition; speech synthesis; CMUDict datasets; NetTalk datasets; OOV words; context dependent machine translation techniques; grapheme-to-phoneme conversion; log linear model; multiple translation models; sequence translation problem; text-to-speech systems; Classification tree analysis; Context modeling; Dictionaries; Error analysis; Hidden Markov models; Machine learning algorithms; Prediction methods; Predictive models; Grapheme-to-Phone conversion

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on

Conference_Location :

Dallas, TX

ISSN :

1520-6149

Print_ISBN :

978-1-4244-4295-9

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2010.5495551

Filename :

5495551

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2798758