DocumentCode
3618596
Title
Lexical stress assignment model for the Slovenian text-to-speech synthesis system
Author
T. Sef
Author_Institution
Jozef Stefan Inst., Ljubljana Univ., Slovenia
fYear
2004
fDate
2004
Firstpage
683
Lastpage
686
Abstract
One of the characteristics of the Slovenian language is that lexical stress can fall on almost any syllable of a word, which makes pronunciation difficult to predict. Some pronunciation rules exist, but they are not precise enough for speech synthesis. A machine-learning technique (decision trees or boosted decision trees) was therefore applied to achieve better results. The paper presents a two-level lexical stress assignment model for out-of-vocabulary Slovenian words, used in our text-to-speech system. First, each vowel is classified as stressed or unstressed, and a type of lexical stress is assigned to every stressed vowel. Then, corrections are made at the word level, according to the number of stressed vowels and the length of the word. For data sets we used the MULTEXT-East Slovene Lexicon, supplemented with lexical stress marks. The accuracy achieved by decision trees significantly outperforms all previous results. However, the sizes of the trees indicate that accentuation in Slovenian is a very complex problem that cannot be solved with a set of relatively simple rules.
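The two-level scheme described in the abstract might be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: the `scorer` callable stands in for the trained decision tree, the vowel set is simplified, the stress type is omitted, and the word-level correction shown here (forcing exactly one stressed vowel, ignoring word length) is a hypothetical rule, since the abstract does not spell out the actual corrections.

```python
from typing import Callable, List

# Simplified vowel inventory (assumption: ignores syllabic 'r' and similar cases).
VOWELS = set("aeiou")

# Hypothetical first-level scorer: (word, vowel index) -> stress score in [0, 1].
# In the paper this role is played by a (boosted) decision tree.
Scorer = Callable[[str, int], float]


def vowel_positions(word: str) -> List[int]:
    """Return the character indices of all vowels in the word."""
    return [i for i, ch in enumerate(word.lower()) if ch in VOWELS]


def assign_stress(word: str, scorer: Scorer, threshold: float = 0.5) -> List[int]:
    """Return indices of vowels marked as stressed after both levels."""
    positions = vowel_positions(word)
    if not positions:
        return []
    scores = {i: scorer(word, i) for i in positions}

    # Level 1: classify each vowel independently as stressed or unstressed.
    stressed = [i for i in positions if scores[i] >= threshold]

    # Level 2 (assumed rule): if the word ends up with zero or several
    # stressed vowels, keep only the highest-scoring one.
    if len(stressed) != 1:
        best = max(positions, key=lambda i: scores[i])
        stressed = [best]
    return stressed
```

With a toy scorer such as `lambda w, i: 1.0 if i == 1 else 0.0`, calling `assign_stress("beseda", scorer)` marks only the vowel at index 1. A real system would replace the scorer with a classifier trained on a stress-annotated lexicon.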
Keywords
"Stress","Speech synthesis","Tires","Dictionaries","Databases","Decision trees","Tree graphs","Intelligent systems","Tiles","Vocabulary"
Publisher
ieee
Conference_Titel
Intelligent Multimedia, Video and Speech Processing, 2004. Proceedings of 2004 International Symposium on
Print_ISBN
0-7803-8687-6
Type
conf
DOI
10.1109/ISIMP.2004.1434156
Filename
1434156