DocumentCode :
3425258
Title :
Improving letter-to-sound conversion performance with automatically generated new words
Author :
You, Jia-Li ; Chen, Yi-Ning ; Soong, Frank K. ; Wang, Jin-Lin
Author_Institution :
Inst. of Acoust., Chinese Acad. of Sci., Beijing
fYear :
2008
fDate :
March 31 2008-April 4 2008
Firstpage :
4653
Lastpage :
4656
Abstract :
We propose a novel way to alleviate the data sparseness problem in training letter-to-sound (LTS) N-gram models by adding automatically generated new words to the training set. The proposed method consists of two procedures: (1) generating a large pool of new words automatically; and (2) selecting good new-word candidates from that pool via semi-supervised learning. The new words are created by replacing stressed syllables of an existing word with other stressed syllables under specified contextual constraints. Selection by semi-supervised learning is based on consistent pronunciation predictions by different LTS models. After the new words are added to the training set, LTS conversion performance improves significantly. For the NetTalk dictionary, a 21.6% relative word error rate reduction is obtained over the N-gram baseline model. For the CMU dictionary, relative word error rate reductions of 9.1% and 5.6% are obtained with and without considering stress, respectively.
Keywords :
learning (artificial intelligence); natural language processing; speech processing; N-gram baseline model; NetTalk dictionary; consistent pronunciation predictions; data sparseness problem; letter-to-sound N-gram models; letter-to-sound conversion performance; new word pool; new word selection; semi-supervised learning; Acoustics; Asia; Decision trees; Dictionaries; Error analysis; Hidden Markov models; Predictive models; Semisupervised learning; Speech; Stress; Letter-to-Sound; artificial data; data sparseness; semi-supervised learning;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
ISSN :
1520-6149
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2008.4518694
Filename :
4518694