• DocumentCode
    352326
  • Title

    Transition-based speech synthesis using neural networks

  • Author

    Corrigan, G. ; Massey, N. ; Schnurr, O.

  • Author_Institution
    Motorola Labs., Schaumburg, IL, USA
  • Volume
    2
  • fYear
    2000
  • fDate
    2000
  • Abstract
    Prior attempts to use neural networks to synthesize speech from a phonetic representation have used the neural network to generate a frame of input to a vocoder. Because this requires the network to compute one output for each frame of speech, it can be computationally expensive. An alternative is to model the speech as a series of gestures and let the neural network generate parameters describing the transitions of the vocoder parameters during these gestures. Experiments have shown that acceptable speech quality is produced when each gesture is half of a phonetic segment and the transition model is a set of cubic polynomials describing the variation of each vocoder parameter during the gesture. This results in a significant reduction in computational cost.
  • Keywords
    neural nets; polynomial approximation; speech synthesis; vocoders; computational cost reduction; cubic polynomials; neural networks; phonetic representation; phonetic segment; series of gestures; speech modeling; speech quality; transition model; transition-based speech synthesis; vocoder parameters; Computational efficiency; Computer networks; Equations; Mean square error methods; Network synthesis; Neural networks; Polynomials; Speech synthesis; Vocoders
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
  • Conference_Location
    Istanbul
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-6293-4
  • Type

    conf

  • DOI
    10.1109/ICASSP.2000.859117
  • Filename
    859117
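
The transition model described in the abstract (one set of cubic polynomials per gesture, one polynomial per vocoder parameter) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the coefficient values, and the choice to normalize time to the range 0 to 1 within a gesture are all assumptions made for illustration, and the hard-coded coefficients stand in for whatever the neural network would output.

```python
from typing import List, Sequence


def eval_cubic(coeffs: Sequence[float], t: float) -> float:
    """Evaluate c0 + c1*t + c2*t^2 + c3*t^3 using Horner's rule."""
    c0, c1, c2, c3 = coeffs
    return c0 + t * (c1 + t * (c2 + t * c3))


def expand_gesture(gesture_coeffs: Sequence[Sequence[float]],
                   n_frames: int) -> List[List[float]]:
    """Turn one gesture's polynomials into per-frame vocoder parameter vectors.

    gesture_coeffs holds one (c0, c1, c2, c3) tuple per vocoder parameter.
    Time is normalized so t runs from 0.0 at the first frame to 1.0 at the
    last frame (an assumption; the source does not state the normalization).
    """
    frames = []
    for i in range(n_frames):
        t = i / (n_frames - 1) if n_frames > 1 else 0.0
        frames.append([eval_cubic(c, t) for c in gesture_coeffs])
    return frames


# Hypothetical coefficients for two vocoder parameters over one gesture.
coeffs = [
    (1.0, 0.0, 0.0, 0.0),  # parameter held constant at 1.0
    (0.0, 1.0, 0.0, 0.0),  # parameter rising linearly from 0.0 to 1.0
]
frames = expand_gesture(coeffs, n_frames=5)
```

Under this scheme the network would be evaluated once per gesture rather than once per frame, and the per-frame work reduces to a few multiplications and additions per vocoder parameter, which is consistent with the cost reduction the abstract reports.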