Title :
Statistically augmented preprocessing/normalization module for a Romanian text-to-speech system
Author :
Ungurean, Catalin ; Burileanu, Dragos ; Surmei, Mihai
Author_Institution :
Speech & Dialogue (SpeeD) Lab., Telecommun. & Inf. Technol. Univ. Politeh. of Bucharest, Bucharest, Romania
Abstract :
This paper addresses the interdependence between sentence boundary detection (SBD), proper name detection (PND) and acronym/abbreviation detection (ABD) from the perspective of implementing a preprocessing/normalization module as the first stage of a Romanian text-to-speech (TTS) system. All three tasks contribute substantially to the intelligibility and naturalness of a synthesized text. Moreover, Romanian is still a resource-scarce language, and building algorithms for the automatic extraction of acronyms/abbreviations and proper names from large text corpora helps to obtain more comprehensive resources for the TTS language processing stage. The paper proposes an improved preprocessing/normalization module for a high-quality Romanian TTS system, mainly by solving a number of difficult preprocessing-level situations in a unified manner.
Keywords :
natural language processing; speech synthesis; statistical analysis; text analysis; ABD; PND; Romanian TTS system; Romanian text-to-speech system; SBD; TTS language processing; abbreviation detection; abbreviation extraction; acronym detection; automatic acronym extraction; building algorithm; proper name detection; resource language; sentence boundary detection; statistically augmented normalization module; statistically augmented preprocessing module; synthesized text intelligibility; synthesized text naturalness; text corpora; Accuracy; Dictionaries; Error analysis; Pragmatics; Training; acronym/abbreviation detection; n-grams; text-to-speech synthesis
Conference_Title :
Speech Technology and Human-Computer Dialogue (SpeD), 2013 7th Conference on
Conference_Location :
Cluj-Napoca
DOI :
10.1109/SpeD.2013.6682665