Regional Information Center for Science and Technology (RICeST) - Towards automatic phonetic segmentation for TTS

DocumentCode :

3164173

Title :

Towards automatic phonetic segmentation for TTS

Author :

Rendel, Asaf ; Sorin, Alexander ; Hoory, Ron ; Breen, Andrew

Author_Institution :

Speech Technol., IBM Haifa Res. Lab., Haifa, Israel

fYear :

2012

fDate :

25-30 March 2012

Firstpage :

4533

Lastpage :

4536

Abstract :

Phonetic segmentation is an important step in the development of a concatenative TTS voice. This paper introduces a segmentation process consisting of two phases. First, forced alignment is performed using an HMM-GMM model. The resulting segmentation is then locally refined using an SVM-based boundary model. Both models are derived from multi-speaker data using a speaker-adaptive training procedure. Evaluation results are obtained on the TIMIT corpus and on a proprietary single-speaker TTS corpus.

Keywords :

Gaussian processes; hidden Markov models; speech processing; support vector machines; Gaussian mixture model; HMM-GMM model; SVM based boundary model; TIMIT corpus; automatic phonetic segmentation; concatenative TTS voice; multispeaker data; single-speaker TTS corpus; speaker adaptive training procedure; Accuracy; Acoustics; Adaptation models; Data models; Training; Vectors; Phoneme alignment; Phonetic segmentation; Text to speech

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on

Conference_Location :

Kyoto

ISSN :

1520-6149

Print_ISBN :

978-1-4673-0045-2

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2012.6288926

Filename :

6288926

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3164173