• DocumentCode
    454614
  • Title

    Prosody Generation for Speech-to-Speech Translation

  • Author

    Agüero, Pablo Daniel ; Adell, Jordi ; Bonafonte, Antonio

  • Author_Institution
    TALP Res. Center, Univ. Politecnica de Catalunya
  • Volume
    1
  • fYear
    2006
  • fDate
    14-19 May 2006
  • Abstract
    This paper deals with speech synthesis in the framework of speech-to-speech translation. Our current focus is to translate speeches or conversations between humans so that a third person can listen to them in their own language. In this framework the style is not written but spoken, and the original speech includes a lot of non-linguistic information (such as speaker emotion). In this work we propose using prosodic features of the original speech to produce prosody in the target language. Relevant features are found using an unsupervised clustering algorithm that finds, in a bilingual speech corpus, intonation clusters in the source speech that are relevant in the target speech. Preliminary results already show a significant improvement in synthetic speech quality (from MOS=3.40 to MOS=3.65).
  • Keywords
    language translation; pattern clustering; speech synthesis; bilingual speech corpus; intonation clusters; prosody generation; speech-to-speech translation; unsupervised clustering algorithm; Broadcasting; Clustering algorithms; Humans; Knowledge representation; Natural languages; Speech recognition; Speech synthesis; Target recognition; Text recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
  • Conference_Location
    Toulouse
  • ISSN
    1520-6149
  • Print_ISBN
    1-4244-0469-X
  • Type

    conf

  • DOI
    10.1109/ICASSP.2006.1660081
  • Filename
    1660081