Title :
Refining segmental boundaries for TTS database using fine contextual-dependent boundary models
Author :
Wang, Lijuan ; Zhao, Yong ; Chu, Min ; Zhou, Jianlai ; Cao, Zhigang
Author_Institution :
Dept. of Electr. Eng., Univ. of Tsinghua, Beijing, China
Abstract :
This paper proposes a post-refining method that uses fine contextual-dependent GMMs for the auto-segmentation task. A GMM trained on a super feature vector, extracted from multiple evenly spaced frames near the boundary, is proposed to describe the waveform evolution across a boundary. CART is used to cluster acoustically similar GMMs, so that the GMM for each leaf node can be trained reliably from the limited set of manually labeled boundaries. An accuracy of 90% is thus achieved when only 250 manually labeled sentences are provided to train the refining models.
Keywords :
Gaussian distribution; feature extraction; hidden Markov models; pattern clustering; speech synthesis; CART; GMM; TTS database; auto-segmentation task; fine contextual-dependent boundary models; manually labeled boundaries; multiple evenly spaced frames; post-refining method; segmental boundaries; super feature vector extraction; waveform evolution; Asia; Automatic speech recognition; Context modeling; Databases; Feature extraction; Hidden Markov models; Labeling; Neural networks; Speech synthesis; Viterbi algorithm
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
Print_ISBN :
0-7803-8484-9
DOI :
10.1109/ICASSP.2004.1326067
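
Illustrative sketch :
The abstract describes scoring a boundary with a model trained on a super feature vector built from several evenly spaced frames around it. The following is a minimal sketch of that idea only, not the authors' implementation. It assumes a single diagonal-covariance Gaussian in place of a full GMM, omits the CART clustering step, and uses made-up values for the number of frames, the frame spacing, and the search window, none of which are given in the abstract. All function and parameter names (`super_vector`, `refine_boundary`, `n_side`, `step`, `search`) are hypothetical.

```python
import numpy as np


def super_vector(features, center, n_side=2, step=1):
    """Concatenate evenly spaced frames around `center` into one vector.

    features: array of shape (T, D), one feature row per frame.
    n_side and step are assumed values, not taken from the paper.
    """
    num_frames = features.shape[0]
    idx = center + step * np.arange(-n_side, n_side + 1)
    idx = np.clip(idx, 0, num_frames - 1)  # repeat edge frames at the ends
    return features[idx].reshape(-1)


def diag_gauss_loglik(x, mean, var):
    """Log-likelihood of x under a diagonal-covariance Gaussian."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)


def refine_boundary(features, initial, mean, var, search=3, n_side=2, step=1):
    """Return the frame near `initial` whose super vector scores highest.

    Candidates are all frames within `search` frames of the initial
    (for example, forced-alignment) boundary.
    """
    num_frames = features.shape[0]
    lo = max(0, initial - search)
    hi = min(num_frames - 1, initial + search)
    candidates = list(range(lo, hi + 1))
    scores = [
        diag_gauss_loglik(super_vector(features, c, n_side, step), mean, var)
        for c in candidates
    ]
    return candidates[int(np.argmax(scores))]
```

In this sketch, `mean` and `var` would be estimated from super vectors taken at manually labeled boundaries of one boundary class. In the method the abstract describes, that class would correspond to a CART leaf node and the model would be a GMM rather than a single Gaussian.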