DocumentCode :
3697460
Title :
Many-to-one voice conversion using exemplar-based sparse representation
Author :
Ryo Aihara;Tetsuya Takiguchi;Yasuo Ariki
Author_Institution :
Graduate School of System Informatics, Kobe University, Japan
fYear :
2015
Firstpage :
1
Lastpage :
5
Abstract :
Voice conversion (VC) is widely researched in the field of speech processing, driven by increased interest in applications such as personalized text-to-speech systems. In this paper, we present a many-to-one VC method using exemplar-based sparse representation, which differs from conventional statistical VC. In our previous exemplar-based VC method, input speech is represented by a source dictionary and its sparse coefficients. The source and target dictionaries are fully coupled, and the converted voice is constructed from the source coefficients and the target dictionary. This method requires parallel exemplars (source and target exemplars of the same texts uttered by the source and target speakers) for dictionary construction. In this paper, we propose a many-to-one VC method in an exemplar-based framework that does not need training data from the source speaker. Some statistical approaches to many-to-one VC have been proposed; however, no such method has previously been proposed in the exemplar-based VC framework. The effectiveness of our many-to-one VC is confirmed through comparison with a conventional one-to-one NMF-based method and a one-to-one GMM-based method.
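The coupled-dictionary conversion described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes Euclidean multiplicative NMF updates for estimating nonnegative activations, and the variable names (`D_src`, `D_tgt`, `H`) are hypothetical. The input spectra `X` are approximated as `D_src @ H`, and the converted spectra are obtained by reusing the activations `H` with the coupled target dictionary.

```python
import numpy as np

def nmf_activations(X, D, n_iter=200, eps=1e-12):
    """Estimate nonnegative activations H so that X ≈ D @ H
    (dictionary D fixed), via multiplicative updates for the
    Euclidean cost. A simplified stand-in for the sparse-coding
    step used in exemplar-based VC."""
    rng = np.random.default_rng(0)
    H = rng.random((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        # Multiplicative update keeps H nonnegative by construction.
        H *= (D.T @ X) / (D.T @ (D @ H) + eps)
    return H

def convert(X_src, D_src, D_tgt):
    """Exemplar-based conversion sketch: code the source spectra
    against the source dictionary, then reconstruct with the
    coupled (parallel) target dictionary."""
    H = nmf_activations(X_src, D_src)
    return D_tgt @ H
```

Because the two dictionaries are built from parallel exemplars, the same activation weights select corresponding frames in both speakers' spaces, which is what makes the reconstruction sound like the target speaker.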
Keywords :
"Dictionaries","Speech","Training data","Sparse matrices","Matrix converters","Signal processing","Noise robustness"
Publisher :
ieee
Conference_Titel :
2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
Type :
conf
DOI :
10.1109/WASPAA.2015.7336943
Filename :
7336943