DocumentCode :
3124952
Title :
Experiments on unsupervised statistical parametric speech synthesis
Author :
Ni, Jinfu ; Shiga, Yoshinori ; Kawai, Hiroyuki ; Kashioka, Hideki
Author_Institution :
Spoken Language Commun. Lab., Universal Commun. Res. Inst., Kyoto, Japan
fYear :
2012
fDate :
5-8 Dec. 2012
Firstpage :
155
Lastpage :
159
Abstract :
In order to build web-based voicefonts, an unsupervised method is needed to automate the extraction of acoustic and linguistic properties of speech. This paper addresses the impact of automatic speech transcription on statistical parametric speech synthesis based on a single speaker's 100-hour speech corpus, focusing particularly on two factors affecting speech quality: transcript accuracy and size of the training dataset. Experimental results indicate that for an unsupervised method to achieve fair (MOS 3) voice quality, 1.5 hours of speech are necessary for phone accuracy over 80% and 3.5 hours are necessary for phone accuracy down to 65%. Improvement in MOS quality turns out not to be significant when more than 4 hours of speech are used. The use of automatic transcripts certainly leads to voice degradation. One of the mechanisms behind this is that transcript errors cause mismatches between speech segments and phone labels that significantly distort the structures of decision trees in the resultant HMM-based voices.
Keywords :
decision trees; hidden Markov models; speech synthesis; unsupervised learning; HMM-based voices; MOS quality; Web-based voicefonts; acoustic properties; decision trees; linguistic properties; phone accuracy; speech corpus; speech quality; training dataset; transcript accuracy; unsupervised statistical parametric speech synthesis; voice degradation; Accuracy; Buildings; Decision trees; Degradation; Hidden Markov models; Speech; Speech synthesis; HMM-based speech synthesis; Voice degradation; automatic speech transcription; unsupervised method; web-based voicefonts;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on
Conference_Location :
Kowloon
Print_ISBN :
978-1-4673-2506-6
Electronic_ISBN :
978-1-4673-2505-9
Type :
conf
DOI :
10.1109/ISCSLP.2012.6423518
Filename :
6423518