DocumentCode
454680
Title
Database Pruning for Unsupervised Building of Text-To-Speech Voices
Author
Adell, Jordi ; Agüero, Pablo Daniel ; Bonafonte, Antonio
Author_Institution
Dept. of Signal Theory & Communications, Univ. Politecnica de Catalunya
Volume
1
fYear
2006
fDate
14-19 May 2006
Abstract
Unit selection techniques represent the state of the art in speech synthesis. Building a new voice requires automatic segmentation of the speech database. The database itself may contain errors, and the segmentation process may introduce more. High-quality systems require significant effort to find and correct these segmentation errors. Phonetic transcription is crucial and is one of the manually supervised tasks. Automatically removing incorrectly transcribed units from the inventory would help make the process more automatic. Here we present a new technique, based on speech recognition confidence measures, that removes 90% of incorrectly transcribed units from a database at the cost of losing only 10% of correctly transcribed units.
Keywords
speech recognition; speech synthesis; automatic segmentation; database pruning; phonetic transcription; text-to-speech voices; unit selection speech synthesis; unsupervised building; Art; Costs; Databases; Error correction; Hidden Markov models; Inventory management; Natural languages; Speech recognition; Speech synthesis; Synthesizers
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
Conference_Location
Toulouse
ISSN
1520-6149
Print_ISBN
1-4244-0469-X
Type
conf
DOI
10.1109/ICASSP.2006.1660164
Filename
1660164