• DocumentCode
    312301
  • Title
    A recurrent network that learns to pronounce English text
  • Author
    Adamson, M.J.; Damper, R.I.
  • Author_Institution
    Dept. of Electron. & Comput. Sci., Southampton Univ., UK
  • Volume
    3
  • fYear
    1996
  • fDate
    3-6 Oct 1996
  • Firstpage
    1704
  • Abstract
    Previous attempts to derive connectionist models for text-to-phoneme conversion, such as NETtalk and NETspeak, have generally used pre-aligned training data and purely feedforward networks, both of which represent simplifications of the problem. In this work, we explore the potential of recurrent networks to perform the conversion task when trained on non-aligned data. Initially, our use of a single recurrent network produced disappointing results. This led to the definition of a two-phase model in which the hidden-unit representation of an auto-associative network was fed forward to a recurrent network. Although this model currently does not perform as well as NETspeak, it is solving a harder problem. Also, we propose several possible avenues for improvement.
  • Keywords
    learning (artificial intelligence); natural language interfaces; performance evaluation; recurrent neural nets; speech synthesis; English text pronunciation; NETspeak; NETtalk; auto-associative network; connectionist models; feedforward networks; hidden-unit representation; learning; pre-aligned training data; recurrent network; text-to-phoneme conversion; two-phase model; Computer science; Damping; Image converters; Intelligent networks; Intelligent systems; Intersymbol interference; Multilayer perceptrons; Shock absorbers; Speech; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    0-7803-3555-4
  • Type
    conf
  • DOI
    10.1109/ICSLP.1996.607955
  • Filename
    607955