• DocumentCode
    394338
  • Title
    Recent improvements to the IBM trainable speech synthesis system
  • Author
    Eide, E. ; Aaron, A. ; Bakis, R. ; Cohen, P. ; Donovan, R. ; Hamza, W. ; Mathes, T. ; Picheny, M. ; Polkosky, M. ; Smith, M. ; Viswanathan, M.
  • Author_Institution
    IBM T. J. Watson Research Center, Yorktown Heights, NY, USA
  • Volume
    1
  • fYear
    2003
  • fDate
    6-10 April 2003
  • Abstract
    In this paper, we describe the current status of the trainable text-to-speech system at IBM. Recent algorithmic and database changes to the system have led to significant gains in output quality. On the algorithms side, we have introduced statistical models for predicting pitch and duration targets, which replace the rule-based target generation previously employed. Additionally, we have changed the cost function and the search strategy, introduced a post-search pitch smoothing algorithm, and improved our method of preselection. Through the combined data and algorithmic contributions, we have been able to significantly improve (p < 0.0001) the mean opinion score (MOS) of our female voice from 3.68 to 4.85 when heard over loudspeakers and to 5.42 when heard over the telephone (seven-point scale).
  • Keywords
    frequency estimation; prediction theory; search problems; smoothing methods; speech synthesis; statistical analysis; IBM trainable speech synthesis system; algorithmic changes; cost function; database changes; duration; mean opinion score; output quality; pitch prediction; post-search pitch smoothing algorithm; preselection; search strategy; statistical models; text-to-speech system; Cost function; Databases; Decision trees; Knowledge based systems; Signal generators; Signal processing algorithms; Smoothing methods; Speech processing; Speech synthesis; Stress;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), Proceedings
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7663-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.2003.1198879
  • Filename
    1198879