DocumentCode :
2179356
Title :
Vocal attractiveness of statistical speech synthesisers
Author :
Andraszewicz, Sandra ; Yamagishi, Junichi ; King, Simon
Author_Institution :
Centre for Speech Technol. Res., Univ. of Edinburgh, Edinburgh, UK
fYear :
2011
fDate :
22-27 May 2011
Firstpage :
5368
Lastpage :
5371
Abstract :
Our previous analysis of speaker-adaptive HMM-based speech synthesis methods suggested that there are two possible reasons why average voices can obtain higher subjective scores than any individual adapted voice: 1) model adaptation degrades speech quality proportionally to the distance 'moved' by the transforms, and 2) psychoacoustic effects relating to the attractiveness of the voice. This paper is a follow-on from that analysis and aims to separate these effects out. Our latest perceptual experiments focus on attractiveness, using average voices and speaker-dependent voices without model transformation, and show that using several speakers to create a voice improves smoothness (measured by Harmonics-to-Noise Ratio), reduces distance from the average voice in the log F0-F1 space of the final voice and hence makes it more attractive at the segmental level. However, this is weakened or overridden at supra-segmental or sentence levels.
Keywords :
hidden Markov models; speaker recognition; speech synthesis; speaker-adaptive HMM-based speech synthesis methods; speaker-dependent voices; statistical speech synthesisers; vocal attractiveness; Analytical models; Atmospheric measurements; Correlation; Hidden Markov models; Particle measurements; Speech; Training data; HMM; attractiveness; average voice; speaker adaptation; speech synthesis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
ISSN :
1520-6149
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2011.5947571
Filename :
5947571