DocumentCode :
3618596
Title :
Lexical stress assignment model for the Slovenian text-to-speech synthesis system
Author :
T. Sef
Author_Institution :
Jozef Stefan Inst., Ljubljana Univ., Slovenia
fYear :
2004
fDate :
2004
Firstpage :
683
Lastpage :
686
Abstract :
A characteristic of the Slovenian language is that lexical stress can fall almost arbitrarily on any syllable of a word, which makes pronunciation hard to predict. Some pronunciation rules exist, but their precision is not sufficient for efficient speech synthesis. A machine-learning technique (decision trees or boosted decision trees) was therefore applied to achieve better results. The paper presents a two-level lexical stress assignment model for out-of-vocabulary Slovenian words, used in our text-to-speech system. First, each vowel is classified as stressed or unstressed, and a type of lexical stress is assigned to every stressed vowel. Then, corrections are made at the word level according to the number of stressed vowels and the length of the word. As data sets we used the MULTEXT-East Slovene Lexicon, supplemented with lexical stress marks. The accuracy achieved by decision trees significantly outperforms all previous results. However, the sizes of the trees indicate that accentuation in Slovenian is a very complex problem and that a solution in the form of relatively simple rules is not possible.
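The two-level scheme described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the `toy_score` stand-in for a trained decision tree, the 0.5 threshold, and the word-level rule that forces exactly one stressed vowel. The paper's actual features, its stress types, and its correction rules (which also depend on word length) are not given in this record, so the sketch omits them.

```python
from typing import Callable, List

# Assumption: only orthographic vowels are considered; syllabic "r" is ignored.
VOWELS = set("aeiou")


def vowel_positions(word: str) -> List[int]:
    """Return the character indices of the vowels in a word."""
    return [i for i, ch in enumerate(word.lower()) if ch in VOWELS]


def toy_score(word: str, index: int) -> float:
    """Hypothetical stand-in for a trained per-vowel classifier.

    It simply favors the penultimate vowel; a real system would use a
    decision tree trained on a stress-annotated lexicon.
    """
    positions = vowel_positions(word)
    if len(positions) >= 2 and index == positions[-2]:
        return 0.9
    return 0.1


def assign_stress(
    word: str,
    score: Callable[[str, int], float] = toy_score,
    threshold: float = 0.5,
) -> List[int]:
    """Return the indices of the vowels marked as stressed."""
    positions = vowel_positions(word)
    if not positions:
        return []

    # Level 1: decide for each vowel independently.
    scores = {i: score(word, i) for i in positions}
    stressed = [i for i in positions if scores[i] >= threshold]

    # Level 2: word-level correction (hypothetical rule). If level 1 marked
    # no vowel or several vowels, keep only the highest-scoring vowel.
    if len(stressed) != 1:
        stressed = [max(positions, key=lambda i: scores[i])]
    return stressed
```

With the toy scorer, `assign_stress("beseda")` marks the vowel at index 3, and a one-vowel word such as `"pes"` is handled by the word-level correction because no vowel passes the threshold at the first level.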
Keywords :
"Stress","Speech synthesis","Tires","Dictionaries","Databases","Decision trees","Tree graphs","Intelligent systems","Tiles","Vocabulary"
Publisher :
IEEE
Conference_Titel :
Intelligent Multimedia, Video and Speech Processing, 2004. Proceedings of 2004 International Symposium on
Print_ISBN :
0-7803-8687-6
Type :
conf
DOI :
10.1109/ISIMP.2004.1434156
Filename :
1434156