• DocumentCode
    3124677
  • Title
    A unified trajectory tiling approach to high quality TTS and cross-lingual voice transformation
  • Author
    Qian, Yao; Soong, Frank K.
  • Author_Institution
    Microsoft Res. Asia, Beijing, China
  • fYear
    2012
  • fDate
    5-8 Dec. 2012
  • Firstpage
    165
  • Lastpage
    169
  • Abstract
    In human-machine speech communication, it is technically challenging to make the machine talk as naturally as a human so as to facilitate “frictionless” interactions, or to make a human user feel that the communication is as natural as human-human communication. We propose a trajectory tiling approach to high quality speech synthesis, where the speech parameter trajectories, extracted from natural, processed, or synthesized speech, are used to guide the search for the best sequence of waveform segment “tiles” stored in a pre-recorded speech database. We test our approach in both TTS and cross-lingual voice transformation applications. Experimental results show that the proposed trajectory tiling approach can render speech which is both natural and highly intelligible. The perceived high quality of the speech is also confirmed in objective and subjective tests.
  • Keywords
    human computer interaction; speech intelligibility; speech synthesis; waveform analysis; best waveform segment tile sequence search; cross-lingual voice transformation applications; high quality TTS; high quality speech synthesis; human-machine speech communication; natural communication; speech database; speech intelligibility; speech parameter trajectories; speech rendering; unified trajectory tiling approach; Hidden Markov models; Rendering (computer graphics); Speech; Speech synthesis; Tiles; Training data; Trajectory; cross-lingual; speech synthesis; trajectory tiling; voice transformation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on
  • Conference_Location
    Kowloon
  • Print_ISBN
    978-1-4673-2506-6
  • Electronic_ISBN
    978-1-4673-2505-9
  • Type
    conf
  • DOI
    10.1109/ISCSLP.2012.6423506
  • Filename
    6423506