DocumentCode :
3095815
Title :
Log Likelihood Ratio Based Annotation Verification of a Norwegian Speech Synthesis Database
Author :
Amdal, Ingunn ; Johnsen, Magne Hallstein ; Svendsen, Torbjørn
Author_Institution :
Dept. of Electron. & Telecommun., Norwegian Univ. of Sci. & Technol., Trondheim
fYear :
2006
fDate :
June 2006
Firstpage :
186
Lastpage :
189
Abstract :
Accurate labeling and segmentation of the unit inventory database is of vital importance to the quality of unit selection text-to-speech synthesis. Misalignments and mismatches between the predicted and pronounced unit sequences require manual correction to achieve natural-sounding synthesis. In this paper, we use log likelihood ratio based utterance verification to automatically detect annotation errors in a Norwegian two-speaker synthesis database. Each sentence is assigned a confidence score, and those falling below a threshold can be discarded or manually inspected and corrected. Using equal reject number as a criterion, the transcription sentence error rate was reduced from 9.8% to 2.7%. Insertions are the largest error category, and 95.6% of these were detected. A closer inspection of false rejections was performed to assess (and improve) the phoneme prediction system.
Keywords :
audio databases; natural languages; speaker recognition; speech synthesis; Norwegian two-speaker synthesis database; annotation verification; log likelihood ratio based utterance verification; phoneme prediction system; Acoustic signal detection; Automatic speech recognition; Error analysis; Inspection; Labeling; Loudspeakers; Natural languages; Spatial databases; Speech synthesis; Uniform resource locators
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing Symposium, 2006. NORSIG 2006. Proceedings of the 7th Nordic
Conference_Location :
Reykjavik
Print_ISBN :
1-4244-0412-6
Electronic_ISBN :
1-4244-0413-4
Type :
conf
DOI :
10.1109/NORSIG.2006.275206
Filename :
4052201