DocumentCode :
3164173
Title :
Towards automatic phonetic segmentation for TTS
Author :
Rendel, Asaf ; Sorin, Alexander ; Hoory, Ron ; Breen, Andrew
Author_Institution :
Speech Technol., IBM Haifa Res. Lab., Haifa, Israel
fYear :
2012
fDate :
25-30 March 2012
Firstpage :
4533
Lastpage :
4536
Abstract :
Phonetic segmentation is an important step in the development of a concatenative TTS voice. This paper introduces a segmentation process consisting of two phases. First, forced alignment is performed using an HMM-GMM model. The resulting segmentation is then locally refined using an SVM-based boundary model. Both models are derived from multi-speaker data using a speaker adaptive training procedure. Evaluation results are obtained on the TIMIT corpus and on a proprietary single-speaker TTS corpus.
Keywords :
Gaussian processes; hidden Markov models; speech processing; support vector machines; Gaussian mixture model; HMM-GMM model; SVM based boundary model; TIMIT corpus; automatic phonetic segmentation; concatenative TTS voice; hidden Markov models; multispeaker data; single-speaker TTS corpus; speaker adaptive training procedure; support vector machines; Accuracy; Acoustics; Adaptation models; Data models; Hidden Markov models; Training; Vectors; Phoneme alignment; Phonetic segmentation; Text to speech;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
ISSN :
1520-6149
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2012.6288926
Filename :
6288926