Title :
Voice conversion with linear prediction residual estimation
Author :
Percybrooks, Winston S. ; Moore, Elliot, II
Date :
March 31 2008-April 4 2008
Abstract :
The work presented here shows a comparison between a voice conversion system based on converting only the vocal tract representation of the source speaker and an augmented system that adds an algorithm for estimating the target excitation signal. The estimation algorithm uses a stochastic model for relating the excitation signal to the vocal tract features. The two systems were subjected to objective and subjective tests for assessing the effectiveness of the perceived identity conversion and the overall quality of the synthesized speech. Male-to-male and female-to-female conversion cases were tested. The main objective of this work is to improve the recognizability of the converted speech while maintaining a high synthesis quality.
Keywords :
prediction theory; signal representation; speech synthesis; stochastic processes; augmented system; linear prediction residual estimation; speech recognition; stochastic model; target excitation signal estimation; vocal tract representation; voice conversion system; Filters; Frequency conversion; Frequency estimation; Frequency synthesizers; Signal synthesis; Speech recognition; Speech synthesis; Stochastic processes; System testing; GMM; LP residual estimation; Linear spectral frequencies; Voice conversion;
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4244-1483-3
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2008.4518699