DocumentCode :
3527500
Title :
Voice conversion based on simultaneous modelling of spectrum and F0
Author :
Yutani, Kaori ; Uto, Yosuke ; Nankaku, Yoshihiko ; Lee, Akinobu ; Tokuda, Keiichi
Author_Institution :
Dept. of Comput. Sci. & Eng., Nagoya Inst. of Technol., Nagoya
fYear :
2009
fDate :
19-24 April 2009
Firstpage :
3897
Lastpage :
3900
Abstract :
This paper proposes simultaneous modeling of spectrum and F0 for voice conversion based on MSD (multi-space probability distribution) models. As a conventional technique, spectral conversion based on a GMM (Gaussian mixture model) has been proposed. Although this technique converts spectral feature sequences nonlinearly based on the GMM, F0 sequences are usually converted by a simple linear function. This is because F0 is undefined in unvoiced segments. To overcome this problem, we apply MSD models. The MSD-GMM makes it possible to model continuous F0 values in voiced frames and a discrete symbol representing unvoiced frames within a unified framework. Furthermore, the MSD-HMM is adopted to model long-term correlations in F0 sequences.
Keywords :
Gaussian processes; correlation methods; hidden Markov models; sequences; spectral analysis; speech processing; statistical distributions; GMM; Gaussian mixture model; HMM; correlation method; discrete symbol; linear function; multispace probability distribution; nonlinear spectral feature sequence; simultaneous F0 modelling; simultaneous spectrum modelling; voice conversion; Computer science; Covariance matrix; Data mining; Probability distribution; Speech; Yttrium; F0 conversion; MSD-GMM; MSD-HMM; voice conversion;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location :
Taipei
ISSN :
1520-6149
Print_ISBN :
978-1-4244-2353-8
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2009.4960479
Filename :
4960479