• DocumentCode
    3341337
  • Title
    A system for voice conversion based on probabilistic classification and a harmonic plus noise model
  • Author
    Stylianou, Yannis ; Cappé, Olivier
  • Author_Institution
    Res. Labs., AT&T Labs., Florham Park, NJ, USA
  • Volume
    1
  • fYear
    1998
  • fDate
    12-15 May 1998
  • Firstpage
    281
  • Abstract
    Voice conversion is defined as modifying the speech signal of one speaker (source speaker) so that it sounds as if it had been pronounced by a different speaker (target speaker). This paper describes a system for efficient voice conversion. A novel mapping function is presented which associates the acoustic space of the source speaker with the acoustic space of the target speaker. The proposed system is based on the use of a Gaussian mixture model (GMM) to model the acoustic space of a speaker and a pitch-synchronous harmonic plus noise representation of the speech signal for prosodic modifications. The mapping function is a continuous parametric function which takes into account the probabilistic classification provided by the GMM. Evaluation by objective tests showed that the proposed system was able to reduce the perceptual distance between the source and target speaker by 70%. Formal listening tests also showed that 97% of the converted speech was judged to be spoken by the target speaker while maintaining high speech quality.
  • Keywords
    Gaussian processes; harmonics; noise; signal representation; speech intelligibility; speech processing; speech synthesis; Gaussian mixture model; acoustic space; continuous parametric function; formal listening tests; harmonic plus noise model; high speech quality; interpreted telephony; low rate bit speech coding; mapping function; objective tests; perceptual distance reduction; pitch synchronous harmonics; probabilistic classification; probability; prosodic modifications; source speaker; speech signal representation; target speaker; text-to-speech synthesis; voice conversion system; Acoustic noise; Acoustic testing; Gaussian noise; Loudspeakers; Speech coding; Speech recognition; Speech synthesis; System testing; Telephony; Vector quantization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
  • Conference_Location
    Seattle, WA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-4428-6
  • Type
    conf
  • DOI
    10.1109/ICASSP.1998.674422
  • Filename
    674422