DocumentCode :
3431044
Title :
Spectral conversion using deep neural networks trained with multi-source speakers
Author :
Li-Juan Liu ; Ling-Hui Chen ; Zhen-Hua Ling ; Li-Rong Dai
Author_Institution :
Nat. Eng. Lab. of Speech & Language Inf. Process., Hefei, China
fYear :
2015
fDate :
19-24 April 2015
Firstpage :
4849
Lastpage :
4853
Abstract :
This paper presents a method for voice conversion using deep neural networks (DNNs) trained with multiple source speakers. The proposed DNNs can be used in two ways for different scenarios: 1) in the absence of training data for the source speaker, the DNNs can be treated as source-speaker-independent models and perform conversion directly from arbitrary source speakers to a certain target speaker; 2) the DNNs can also serve as initial models for further fine-tuning of source-speaker-dependent DNNs when parallel training data for both the source and target speakers are available. Experimental results show that, as source-speaker-independent models, the proposed DNNs achieve performance comparable to conventional source-speaker-dependent models. As an initialization method, the proposed approach outperforms conventional initialization with restricted Boltzmann machines (RBMs).
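The two usage scenarios in the abstract can be sketched as a minimal two-stage training loop: first train a single spectral-mapping network on parallel data pooled from several source speakers (the source-speaker-independent model), then copy it and fine-tune on one speaker's parallel data (the source-speaker-dependent model). The sketch below is a hedged illustration, not the authors' implementation: it uses a tiny one-hidden-layer NumPy network and random stand-ins for aligned spectral feature vectors, and all names (`train`, `si_model`, `sd_model`, the feature dimensions) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(d_in, d_hid, d_out):
    # Small random weights for a one-hidden-layer network.
    return {
        "W1": rng.normal(0, 0.1, (d_in, d_hid)), "b1": np.zeros(d_hid),
        "W2": rng.normal(0, 0.1, (d_hid, d_out)), "b2": np.zeros(d_out),
    }

def forward(p, X):
    H = np.tanh(X @ p["W1"] + p["b1"])
    return H, H @ p["W2"] + p["b2"]

def mse(p, X, Y):
    _, Yhat = forward(p, X)
    return float(np.mean((Yhat - Y) ** 2))

def train(p, X, Y, steps=300, lr=0.05):
    # Full-batch gradient descent on the MSE spectral-distortion loss.
    n, d_out = X.shape[0], Y.shape[1]
    for _ in range(steps):
        H, Yhat = forward(p, X)
        dY = 2.0 * (Yhat - Y) / (n * d_out)
        dW2, db2 = H.T @ dY, dY.sum(axis=0)
        dH = (dY @ p["W2"].T) * (1.0 - H ** 2)  # back through tanh
        dW1, db1 = X.T @ dH, dH.sum(axis=0)
        for k, g in zip(("W1", "b1", "W2", "b2"), (dW1, db1, dW2, db2)):
            p[k] -= lr * g
    return p

# Parallel data pooled from several hypothetical source speakers, all
# aligned to the same target speaker (random stand-ins for spectra).
d_in, d_out = 8, 8
pooled_X = rng.normal(size=(300, d_in))
pooled_Y = 0.5 * pooled_X + rng.normal(0, 0.1, size=(300, d_out))

# Scenario 1: source-speaker-independent model from the pooled data;
# usable directly on an arbitrary, unseen source speaker.
si_model = train(init_mlp(d_in, 16, d_out), pooled_X, pooled_Y)

# Scenario 2: when parallel data for one source speaker exists, copy the
# SI model and fine-tune it into a source-speaker-dependent model.
spk_X = rng.normal(size=(60, d_in))
spk_Y = 0.5 * spk_X + 0.2 + rng.normal(0, 0.1, size=(60, d_out))
sd_model = {k: v.copy() for k, v in si_model.items()}
loss_before = mse(sd_model, spk_X, spk_Y)
sd_model = train(sd_model, spk_X, spk_Y)
loss_after = mse(sd_model, spk_X, spk_Y)
```

Fine-tuning from the pooled model plays the same role as the RBM pre-training it replaces in the paper's comparison: it supplies a better-than-random starting point for the speaker-dependent network.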
Keywords :
learning (artificial intelligence); neural nets; speaker recognition; speech processing; arbitrary source speaker; deep neural network; parallel training data; source speaker dependent DNN; source speaker independent model; spectral conversion; voice conversion; Artificial neural networks; Data models; Speech; Speech processing; Training; Training data; deep neural networks; source-speaker-independent mapping; voice conversion;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD, Australia
Type :
conf
DOI :
10.1109/ICASSP.2015.7178892
Filename :
7178892