Title :
Refining Unit Boundaries for Mandarin Text-to-Speech Database
Author :
Dong, Minghui ; Cen, Ling ; Chan, Paul ; Li, Haizhou
Author_Institution :
Inst. for Infocomm Res. (I2R), A*STAR, Singapore, Singapore
Abstract :
In unit-selection-based text-to-speech (TTS) synthesis, the accuracy of the unit boundary positions in the unit selection database is one of the factors that determine the quality of the synthesized speech. To ensure that the boundary positions are accurate, developers often have to manually verify the speech boundaries generated by automatic speech recognition techniques. To reduce this manual workload, automatic methods of refining the unit boundary positions are needed. This paper proposes a frame-shift method that finds the globally optimal joint position for unit concatenation between any two matching units. Experimental results show that this method can improve the boundary accuracy compared to manual labeling.
Keywords :
database management systems; speech recognition; speech synthesis; Mandarin text-to-speech database; automatic speech recognition techniques; frame-shift method; unit selection based text-to-speech synthesis; Automatic speech recognition; Databases; Feature extraction; Hidden Markov models; Mel frequency cepstral coefficient; Natural languages; Optimization methods; Signal processing algorithms; Speech processing; optimization; unit boundary; unit selection
Conference_Title :
Asian Language Processing, 2009. IALP '09. International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-0-7695-3904-1
DOI :
10.1109/IALP.2009.59