• DocumentCode
    900383
  • Title
    Nonparallel training for voice conversion based on a parameter adaptation approach
  • Author
    Mouchtaris, Athanasios ; Van der Spiegel, Jan ; Mueller, Paul
  • Author_Institution
    Electr. & Syst. Eng. Dept., Univ. of Pennsylvania, Philadelphia, PA, USA
  • Volume
    14
  • Issue
    3
  • fYear
    2006
  • fDate
    5/1/2006 12:00:00 AM
  • Firstpage
    952
  • Lastpage
    963
  • Abstract
    The objective of voice conversion algorithms is to modify the speech of a particular source speaker so that it sounds as if spoken by a different target speaker. Current conversion algorithms employ a training procedure in which the same utterances, spoken by both the source and target speakers, are needed to derive the desired conversion parameters. Such a (parallel) corpus is often difficult or impossible to collect. Here, we propose an algorithm that relaxes this constraint, i.e., the training corpus does not necessarily contain the same utterances from both speakers. The proposed algorithm is based on speaker adaptation techniques, adapting the conversion parameters derived for a particular pair of speakers to a different pair, for which only a nonparallel corpus is available. We show that adaptation reduces the error obtained when simply applying the conversion parameters of one pair of speakers to another by a factor that can reach 30%. A speaker identification measure is also employed that more insightfully portrays the importance of adaptation, while listening tests confirm the success of our method. Both the objective and the subjective tests demonstrate that the proposed algorithm achieves results comparable to the ideal case in which a parallel corpus is available.
  • Keywords
    speaker recognition; speech synthesis; nonparallel training; parameter adaptation approach; speaker adaptation techniques; speaker identification measure; voice conversion algorithms; Adaptation model; Computer science; Loudspeakers; Speech enhancement; Speech processing; Speech synthesis; Systems engineering and theory; Telephony; Testing; Vector quantization; Gaussian mixture model; speaker adaptation; text-to-speech synthesis; voice conversion;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TSA.2005.857790
  • Filename
    1621207