• DocumentCode
    394296
  • Title
    Trainable Cantonese/English dual language speech synthesis system
  • Author
    Li, Haiping ; Chen, Fangxin ; Shen, Li Qin ; Ma, Xi Jun
  • Author_Institution
    IBM China Res. Lab., Beijing, China
  • Volume
    1
  • fYear
    2003
  • fDate
    6-10 April 2003
  • Abstract
    The paper introduces a Cantonese/English dual language text-to-speech (TTS) system. It was developed on IBM's trainable TTS technology, which uses trainable statistical models to automate speech data processing and selection. The Cantonese and English phonological, syntactic and prosodic rules were built into a dual-language delta module, which processes the mixed-language input accordingly and generates mixed Cantonese and English speech with coherent prosody. To approximate the speaker's characteristics, a speaker prosody profile was extracted from the dataset and incorporated into the delta speech rule processing for the enhancement of duration, lexical tone and intonation prediction. In the selection of the concatenative unit set, we experimented with different Cantonese syllable decomposition schemes. Though this system is currently only implemented for Cantonese, it can be easily adapted to other tonal languages.
  • Keywords
    learning (artificial intelligence); natural languages; prediction theory; speech synthesis; statistical analysis; text analysis; Cantonese speech synthesis system; Cantonese/English dual language speech synthesis system; English speech synthesis system; concatenative unit set; delta speech rule processing; duration prediction; intonation prediction; lexical tone prediction; phonological rules; prosodic rules; speaker prosody profile; speech data processing; syntactic rules; trainable TTS technology; trainable speech synthesis system; trainable statistical models; Data mining; Data processing; Iron; Labeling; Loudspeakers; Natural languages; Signal synthesis; Speech enhancement; Speech processing; Speech synthesis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7663-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.2003.1198829
  • Filename
    1198829