  • DocumentCode
    312236
  • Title
    A new speech synthesis system based on the ARX speech production model
  • Author
    Zhu, Weizhong; Kasuya, Hideki
  • Author_Institution
    Fac. of Eng., Utsunomiya Univ., Japan
  • Volume
    3
  • fYear
    1996
  • fDate
    3-6 Oct 1996
  • Firstpage
    1413
  • Abstract
    We present a new formant-type speech analysis-synthesis system based on the ARX (Auto-Regressive with Exogenous Input) speech production model. The model consists of cascade formant-antiformant synthesizers driven by a voicing source and an unvoiced turbulent noise source. A key feature of the proposed method is an algorithm that automatically measures the voicing source, unvoiced source and formant-antiformant parameters of the synthesizer directly from natural speech waveforms. Once the parameter estimates have been obtained from natural speech, they can be manipulated with a flexible editing tool developed as part of the system. By changing the values of the fundamental frequency, glottal open quotient, spectral tilt parameter, turbulent noise level, and formant-antiformant frequencies and bandwidths, we can synthesize natural-sounding speech with various voice qualities, including modal, breathy, tense, and whisper voice. The acoustic correlates of these voice qualities can be systematically investigated using the proposed system. Since our analysis-editing-synthesis system has been developed on the MS-Windows platform, it is expected to be a useful tool in various basic areas of speech science and technology.
  • Keywords
    noise; parameter estimation; spectral analysis; speech processing; speech synthesis; statistical analysis; ARX speech production model; Auto-Regressive with Exogenous Input; MS-Windows; acoustic correlates; bandwidths; cascade formant-antiformant synthesizers; flexible editing tool; formant-type speech analysis system; fundamental frequency; glottal open quotient; natural sounding speech; natural speech; natural speech waveforms; parameter estimation; spectral tilt parameter; speech synthesis system; turbulent noise level; unvoiced turbulent noise source; voicing source; Acoustic noise; Bandwidth; Frequency synthesizers; Natural languages; Noise level; Parameter estimation; Production systems; Speech analysis; Speech enhancement; Speech synthesis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    0-7803-3555-4
  • Type
    conf
  • DOI
    10.1109/ICSLP.1996.607879
  • Filename
    607879