Title :
One sentence voice adaptation using GMM-based frequency-warping and shift with a sub-band basis spectrum model
Author :
Tamura, Masatsune ; Morita, Masahiro ; Kagoshima, Takehiko ; Akamine, Masami
Author_Institution :
Knowledge Media Lab., Toshiba Corp., Kawasaki, Japan
Abstract :
This paper presents a rapid voice adaptation algorithm using GMM-based frequency warping and shift with parameters of a sub-band basis spectrum model (SBM). The SBM parameter represents the shape of the speech spectrum. It is calculated by fitting a sub-band basis to the log-spectrum. Since the parameter is a frequency-domain representation, frequency warping can be applied directly to the SBM parameter. A frequency warping function that minimizes the distance between source and target SBM parameter pairs in each mixture component of a GMM is derived using a dynamic programming (DP) algorithm. The proposed method is evaluated in a unit-selection based voice adaptation framework applied to a unit-fusion based text-to-speech synthesizer. The experimental results show that the proposed adaptation method is effective for rapid voice adaptation using just one sentence, compared to the conventional GMM-based linear transformation of mel-cepstra.
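The DP step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `dp_frequency_warp`, the squared-error distance, the pinned endpoints, and the `max_step` slope constraint are all assumptions made for illustration, and the per-mixture GMM weighting and the shift term are omitted. Given paired source and target parameter vectors (one row per pair, one column per frequency band), it searches for a monotonic band-index mapping that minimizes the summed distance.

```python
import numpy as np

def dp_frequency_warp(source, target, max_step=2):
    """Return w, where w[k] is the source band index mapped to target band k.

    Hypothetical sketch: squared-error cost, endpoints pinned to the first and
    last bands, and w[k] - w[k-1] restricted to 0..max_step (all assumptions).
    """
    source = np.atleast_2d(np.asarray(source, dtype=float))
    target = np.atleast_2d(np.asarray(target, dtype=float))
    n = source.shape[1]
    # local[k, j]: squared error, summed over all pairs, when source band j
    # is used for target band k.
    local = ((target[:, :, None] - source[:, None, :]) ** 2).sum(axis=0)
    cost = np.full((n, n), np.inf)
    back = np.zeros((n, n), dtype=int)
    cost[0, 0] = local[0, 0]  # the path must start at w[0] = 0
    for k in range(1, n):
        for j in range(n):
            lo = max(0, j - max_step)
            prev = cost[k - 1, lo:j + 1]  # allowed predecessors of band j
            best = int(np.argmin(prev))
            cost[k, j] = prev[best] + local[k, j]
            back[k, j] = lo + best
    # Backtrack from the pinned end point w[n-1] = n-1.
    w = np.empty(n, dtype=int)
    w[-1] = n - 1
    for k in range(n - 1, 0, -1):
        w[k - 1] = back[k, w[k]]
    return w

# Toy example: a spectral peak at band 3 in the source sits at band 4 in the target.
src = np.array([0.0, 0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 0.0])
tgt = np.array([0.0, 0.0, 0.0, 1.0, 4.0, 1.0, 0.0, 0.0])
warp = dp_frequency_warp(src, tgt)
warped = src[warp]  # source spectrum after applying the warping function
```

In this toy case the warped source matches the target exactly, since a zero-cost monotonic path exists. Real SBM parameters would generally leave a residual that the shift term mentioned in the abstract is meant to absorb.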
Keywords :
Gaussian processes; dynamic programming; speech synthesis; DP algorithm; GMM-based frequency-warping; SBM parameter; dynamic programming algorithm; one sentence voice adaptation; speech spectrum; subband basis spectrum model; unit-fusion based text-to-speech synthesizer; unit-selection based voice adaptation framework; voice adaptation algorithm; Databases; Frequency conversion; Frequency domain analysis; Shape; Speech; Speech synthesis; Training; frequency warping; sub-band basis spectrum model; unit fusion speech synthesis; voice adaptation;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2011.5947510