Regional Information Center for Science and Technology - Romanian language statistics and resources for text-to-speech systems

DocumentCode :

1918834

Title :

Romanian language statistics and resources for text-to-speech systems

Author :

Stan, Adriana ; Giurgiu, Mircea

Author_Institution :

Commun. Dept., Tech. Univ. of Cluj-Napoca, Cluj-Napoca, Romania

fYear :

2010

fDate :

11-12 Nov. 2010

Firstpage :

381

Lastpage :

384

Abstract :

This paper introduces a series of results and experiments used in the development of a Romanian text-to-speech system, focusing on text statistics. We investigate the presence of several linguistic units used in text-to-speech systems, from phonemes to words. The text corpus we used, News-Romanian (News-RO), comprises 4500 newspaper articles. A subset of it, around 2500 sentences, represents the Romanian Speech Synthesis (RSS) recorded speech database. The results offer important insight into how a speech database should be designed. We also describe the methods used in the development of a 50,000-word Romanian lexicon with phonetic transcription and accent positioning. Such a lexicon is useful for the machine learning algorithms in the front-end of a text-to-speech system. In addition, we study the use of the Maximal Onset Principle for Romanian syllabification.
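The Maximal Onset Principle mentioned in the abstract can be illustrated with a minimal sketch: consonants between two vowels are assigned to the onset of the following syllable for as long as the resulting cluster is a permitted onset. The vowel set, the `LEGAL_CLUSTERS` list, and the function names below are assumptions made for illustration only; they are not the inventory or the procedure used by the authors, and the sketch treats every vowel as its own nucleus, so it ignores Romanian diphthongs.

```python
# Illustrative inventories only: NOT the phoneme set or onset list from the paper.
VOWELS = {"a", "e", "i", "o", "u"}
LEGAL_CLUSTERS = {
    ("t", "r"), ("p", "r"), ("c", "r"), ("b", "r"),
    ("p", "l"), ("s", "t"), ("s", "t", "r"),
}


def is_legal_onset(segment):
    """Empty onsets and single consonants are always allowed in this sketch."""
    return len(segment) <= 1 or tuple(segment) in LEGAL_CLUSTERS


def syllabify(phonemes):
    """Split a list of phonemes into syllables using the Maximal Onset Principle."""
    nuclei = [i for i, p in enumerate(phonemes) if p in VOWELS]
    if not nuclei:
        return [list(phonemes)]

    boundaries = []  # index at which each non-initial syllable starts
    for prev, nxt in zip(nuclei, nuclei[1:]):
        split = nxt  # fallback: the next syllable has no onset
        # Try the longest candidate onset first, then shorter ones.
        for start in range(prev + 1, nxt + 1):
            if is_legal_onset(phonemes[start:nxt]):
                split = start
                break
        boundaries.append(split)

    syllables, begin = [], 0
    for b in boundaries:
        syllables.append(list(phonemes[begin:b]))
        begin = b
    syllables.append(list(phonemes[begin:]))
    return syllables


def syllabify_word(word):
    """Convenience wrapper that treats each letter as one phoneme (an assumption)."""
    return ["".join(s) for s in syllabify(list(word))]
```

Under these assumptions, `syllabify_word("patru")` keeps the cluster "tr" with the second syllable, while `syllabify_word("carte")` splits "rt" because that cluster is not in the illustrative onset list.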

Keywords :

audio databases; natural language processing; speech synthesis; statistics; News-Romanian; Romanian Speech Synthesis recorded speech database; Romanian language statistics; Romanian lexicon; Romanian syllabification; Romanian text-to-speech system; accent positioning; machine learning algorithms; maximal onset principle; newspaper articles; phonemes; phonetic transcription; sentences; text statistics; words; Databases; Europe; High temperature superconductors; Speech; Speech synthesis; Text processing; Training; Romanian; lexicon; speech synthesis; text-to-speech;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Electronics and Telecommunications (ISETC), 2010 9th International Symposium on

Conference_Location :

Timisoara

Print_ISBN :

978-1-4244-8457-7

Type :

conf

DOI :

10.1109/ISETC.2010.5679318

Filename :

5679318

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1918834