• DocumentCode
    3166436
  • Title

    Automatic pronunciation prediction for text-to-speech synthesis of dialectal Arabic in a speech-to-speech translation system

  • Author

    Ananthakrishnan, Sankaranarayanan ; Tsakalidis, Stavros ; Prasad, Rohit ; Natarajan, Prem ; Vembu, Aravind Namandi

  • Author_Institution
    Language & Multimedia Unit, Raytheon BBN Technol., Cambridge, MA, USA
  • fYear
    2012
  • fDate
    25-30 March 2012
  • Firstpage
    4957
  • Lastpage
    4960
  • Abstract
    Text-to-speech synthesis (TTS) is the final stage in the speech-to-speech (S2S) translation pipeline, producing an audible rendition of translated text in the target language. TTS systems typically rely on a lexicon to look up pronunciations for each word in the input text. This is problematic when the target language is dialectal Arabic, because the statistical machine translation (SMT) system usually produces undiacritized text output. Many words in the latter possess multiple pronunciations; the correct choice must be inferred from context. In this paper, we present a weakly supervised pronunciation prediction approach for undiacritized dialectal Arabic in S2S systems that leverages automatic speech recognition (ASR) to obtain parallel training data for pronunciation prediction. Additionally, we show that incorporating source language features derived from SMT-generated automatic word alignment further improves automatic pronunciation prediction accuracy.
  • Keywords
    language translation; prediction theory; speech recognition; speech synthesis; ASR; S2S translation system; SMT system; SMT-generated automatic word alignment; TTS; automatic pronunciation prediction approach; automatic speech recognition; dialectal Arabic; lexicon-look up pronunciation; source language feature; speech-to-speech translation system; statistical machine translation system; text-to-speech synthesis; Accuracy; Error analysis; Hidden Markov models; Mathematical model; Predictive models; Speech; Training; dialectal Arabic; pronunciation; speech synthesis; speech translation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
  • Conference_Location
    Kyoto
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4673-0045-2
  • Electronic_ISBN
    1520-6149
  • Type

    conf

  • DOI
    10.1109/ICASSP.2012.6289032
  • Filename
    6289032