• DocumentCode
    2838906
  • Title
    An acoustic and articulatory knowledge integrated method for improving synthetic Mandarin speech's fluency
  • Author
    Hung-Yan Gu; Kuo-Hsian Wang
  • Author_Institution
    Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Univ. of Sci. & Technol., Taipei, Taiwan
  • fYear
    2004
  • fDate
    15-18 Dec. 2004
  • Firstpage
    205
  • Lastpage
    208
  • Abstract
    In synthetic Mandarin speech, discontinuity of formant traces at syllable boundaries is a key factor that lowers the fluency level. We therefore study a method that integrates acoustic and articulatory knowledge to solve this discontinuity problem. First, representative trisyllable contexts are selected and their signals are recorded. The signal of the middle syllable of each trisyllable pronunciation is then extracted to make a synthesis unit. To select a synthesis unit among multiple candidates, a distance function is defined to measure the spectral similarity between two synthesis units to be concatenated. In addition, several linking-restriction rules are derived from articulatory knowledge to prevent certain synthesis units from being linked in sequence. A globally best synthesis-unit sequence is then found with a dynamic-programming-based search algorithm. When this method is applied, the formant traces at syllable boundaries become smoother. Subjective evaluation also shows that the fluency level of synthetic Mandarin speech is improved.
  • Keywords
    dynamic programming; knowledge based systems; natural languages; speech; speech synthesis; acoustic knowledge; articulatory knowledge; distance function; dynamic programming; formant trace discontinuities; linking-restriction rules; spectral similarity; syllable boundaries; synthesis unit; synthetic Mandarin speech fluency; trisyllable contexts; Acoustic testing; Acoustical engineering; Computer science; Concatenated codes; Dynamic programming; Heuristic algorithms; Signal synthesis; Speech analysis; Speech processing; Speech synthesis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing, 2004 International Symposium on
  • Print_ISBN
    0-7803-8678-7
  • Type
    conf
  • DOI
    10.1109/CHINSL.2004.1409622
  • Filename
    1409622
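
  • Note
    The unit-selection step summarized in the abstract can be sketched as a Viterbi-style dynamic programming search. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name select_units, the distance callback, and the allowed callback (standing in for the linking-restriction rules) are all hypothetical, and the record does not give the actual distance function or rules.

```python
import math


def select_units(candidates, distance, allowed=lambda prev, cur: True):
    """Choose one synthesis unit per syllable with minimum total join cost.

    candidates: list (one entry per syllable) of lists of candidate units.
    distance:   hypothetical callback giving the spectral distance between
                two units to be concatenated (smaller means smoother join).
    allowed:    hypothetical callback standing in for linking-restriction
                rules; returns False when two units must not be linked.
    """
    if not candidates or any(not c for c in candidates):
        raise ValueError("every syllable needs at least one candidate")

    # cost[j] is the best total cost of any sequence ending at candidate j
    # of the syllable processed so far.
    cost = [0.0] * len(candidates[0])
    back = []  # back[t][j]: index of the best predecessor of candidate j

    for prev_units, cur_units in zip(candidates, candidates[1:]):
        new_cost, pointers = [], []
        for unit in cur_units:
            best, arg = math.inf, None
            for i, prev in enumerate(prev_units):
                # Skip unreachable predecessors and forbidden links.
                if cost[i] == math.inf or not allowed(prev, unit):
                    continue
                total = cost[i] + distance(prev, unit)
                if total < best:
                    best, arg = total, i
            new_cost.append(best)
            pointers.append(arg)
        cost = new_cost
        back.append(pointers)

    end = min(range(len(cost)), key=cost.__getitem__)
    if cost[end] == math.inf:
        raise ValueError("no admissible unit sequence under the given rules")

    # Trace the back pointers from the best final candidate.
    path = [end]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return [candidates[t][j] for t, j in enumerate(path)], cost[end]
```

    In this sketch a unit can be any object the two callbacks understand. For example, a unit could be a pair of boundary spectral values, with distance comparing the end of one unit to the start of the next. The search runs in time proportional to the number of syllables times the square of the number of candidates per syllable.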