• DocumentCode
    454528
  • Title

    Towards Pooled-Speaker Concatenative Text-to-Speech

  • Author

    Eide, E.M. ; Picheny, A.

  • Author_Institution
    IBM Thomas J. Watson Res. Center
  • Volume
    1
  • fYear
    2006
  • fDate
    14-19 May 2006
  • Abstract
    In this paper we explore the merging of data from various speakers in building a concatenative text-to-speech system. First, we investigate the pooling of data from multiple speakers for building statistical models to predict pitch and duration, and present listening test results which show that the expressiveness of our TTS system is improved using these techniques. Additionally, we describe an experiment in which we merged databases from several speakers to form an enlarged database from which our concatenative text-to-speech system draws segments. We present listening test results which show that pooling data from several speakers yields higher quality synthetic speech in general domains than restricting ourselves to the data from just one speaker in our repertoire.
  • Keywords
    speech processing; text analysis; pooled-speaker concatenative text-to-speech; statistical models; synthetic speech; Databases; Decision trees; Engines; Loudspeakers; Merging; Predictive models; Runtime; Speech synthesis; System testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
  • Conference_Location
    Toulouse
  • ISSN
    1520-6149
  • Print_ISBN
    1-4244-0469-X
  • Type

    conf

  • DOI
    10.1109/ICASSP.2006.1659960
  • Filename
    1659960