Title :
A system for voice conversion based on probabilistic classification and a harmonic plus noise model
Author :
Stylianou, Yannis ; Cappé, Olivier
Author_Institution :
Res. Labs., AT&T Labs., Florham Park, NJ, USA
Abstract :
Voice conversion is defined as modifying the speech signal of one speaker (source speaker) so that it sounds as if it had been pronounced by a different speaker (target speaker). This paper describes a system for efficient voice conversion. A novel mapping function is presented which associates the acoustic space of the source speaker with the acoustic space of the target speaker. The proposed system is based on the use of a Gaussian mixture model (GMM) to model the acoustic space of a speaker and a pitch-synchronous harmonic plus noise representation of the speech signal for prosodic modifications. The mapping function is a continuous parametric function which takes into account the probabilistic classification provided by the GMM. Evaluation by objective tests showed that the proposed system was able to reduce the perceptual distance between the source and target speaker by 70%. Formal listening tests also showed that 97% of the converted speech was judged to be spoken by the target speaker while maintaining high speech quality.
Keywords :
Gaussian processes; harmonics; noise; signal representation; speech intelligibility; speech processing; speech synthesis; Gaussian mixture model; acoustic space; continuous parametric function; formal listening tests; harmonic plus noise model; high speech quality; interpreted telephony; low rate bit speech coding; mapping function; objective tests; perceptual distance reduction; pitch synchronous harmonics; probabilistic classification; probability; prosodic modifications; source speaker; speech signal representation; target speaker; text-to-speech synthesis; voice conversion system; Acoustic noise; Acoustic testing; Gaussian noise; Loudspeakers; Speech coding; Speech recognition; Speech synthesis; System testing; Telephony; Vector quantization;
Conference_Title :
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
Conference_Location :
Seattle, WA
Print_ISBN :
0-7803-4428-6
DOI :
10.1109/ICASSP.1998.674422