DocumentCode :
3101589
Title :
Refining Unit Boundaries for Mandarin Text-to-Speech Database
Author :
Dong, Minghui ; Cen, Ling ; Chan, Paul ; Li, Haizhou
Author_Institution :
Inst. for Infocomm Res. (I2R), A*STAR, Singapore, Singapore
fYear :
2009
fDate :
7-9 Dec. 2009
Firstpage :
245
Lastpage :
248
Abstract :
In unit selection based text-to-speech (TTS) synthesis, the accurate position of the unit boundaries in the unit selection database is one of the factors that determine the quality of the synthesized speech. To ensure the accuracy of the boundary positions, developers often have to manually verify the speech boundaries that are generated by automatic speech recognition techniques. In order to reduce the manual workload, it is necessary to use automatic methods of refining the position of the unit boundaries. This paper proposes a frame-shift method to find the globally optimal joint position for unit concatenation between any two matching units. Experimental results show that this method can improve the boundary accuracy compared to manual labeling.
Keywords :
database management systems; speech recognition; speech synthesis; Mandarin text-to-speech database; automatic speech recognition techniques; frame-shift method; unit selection based text-to-speech synthesis; Automatic speech recognition; Databases; Feature extraction; Hidden Markov models; Mel frequency cepstral coefficient; Natural languages; Optimization methods; Signal processing algorithms; Speech processing; Speech synthesis; optimization; speech synthesis; unit boundary; unit selection;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Asian Language Processing, 2009. IALP '09. International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-0-7695-3904-1
Type :
conf
DOI :
10.1109/IALP.2009.59
Filename :
5380742