Title :
A new speech synthesis system based on the ARX speech production model
Author :
Zhu, Weizhong ; Kasuya, Hideki
Author_Institution :
Fac. of Eng., Utsunomiya Univ., Japan
Abstract :
We present a new formant-type speech analysis-synthesis system based on the ARX (Auto-Regressive with Exogenous Input) speech production model. The model consists of cascade formant-antiformant synthesizers driven by a voicing source and an unvoiced turbulent noise source. A key feature of the proposed method is an algorithm that automatically estimates the voicing source, unvoiced source, and formant-antiformant parameters of the synthesizer directly from natural speech waveforms. Once the parameters have been estimated from natural speech, the estimates can be manipulated with a flexible editing tool developed as part of the system. By changing the values of the fundamental frequency, glottal open quotient, spectral tilt parameter, turbulent noise level, and formant-antiformant frequencies and bandwidths, we can synthesize natural-sounding speech with various voice qualities, including modal, breathy, tense, and whisper voice. Acoustic correlates of these voice qualities can be systematically investigated using the proposed system. Since our analysis-editing-synthesis system has been developed on the MS-Windows platform, we expect it to be a useful tool in various basic areas of speech science and technology.
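The source-filter structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the raised-cosine glottal pulse, the Klatt-style two-pole resonator, and all default parameter values are assumptions chosen for illustration, and the abstract does not specify the actual source model or estimation algorithm. The sketch also mixes the voiced and noise sources before the cascade, which is a simplification.

```python
import math
import random

def resonator_coeffs(freq_hz, bw_hz, fs):
    """Coefficients of a two-pole digital resonator with unity gain at DC."""
    r = math.exp(-math.pi * bw_hz / fs)
    c = -r * r
    b = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / fs)
    a = 1.0 - b - c
    return a, b, c

def resonate(x, freq_hz, bw_hz, fs):
    """Formant section: y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    a, b, c = resonator_coeffs(freq_hz, bw_hz, fs)
    y1 = y2 = 0.0
    out = []
    for xn in x:
        yn = a * xn + b * y1 + c * y2
        out.append(yn)
        y2, y1 = y1, yn
    return out

def antiresonate(x, freq_hz, bw_hz, fs):
    """Antiformant section: the exact inverse of the resonator above."""
    a, b, c = resonator_coeffs(freq_hz, bw_hz, fs)
    x1 = x2 = 0.0
    out = []
    for xn in x:
        out.append((xn - b * x1 - c * x2) / a)
        x2, x1 = x1, xn
    return out

def glottal_pulse_train(n_samples, f0_hz, open_quotient, fs):
    """Placeholder voicing source: a raised-cosine pulse during the open
    phase of each pitch period and zero during the closed phase."""
    period = int(round(fs / f0_hz))
    n_open = max(1, int(round(open_quotient * period)))
    out = []
    for n in range(n_samples):
        k = n % period
        if k < n_open:
            out.append(0.5 * (1.0 - math.cos(2.0 * math.pi * k / n_open)))
        else:
            out.append(0.0)
    return out

def synthesize(n_samples, fs=16000, f0_hz=120.0, open_quotient=0.6,
               noise_level=0.05,
               formants=((700.0, 80.0), (1200.0, 90.0), (2600.0, 120.0)),
               antiformants=(), seed=0):
    """Drive cascaded formant/antiformant sections with a voiced source
    plus Gaussian noise standing in for turbulent noise."""
    rng = random.Random(seed)
    voiced = glottal_pulse_train(n_samples, f0_hz, open_quotient, fs)
    signal = [v + noise_level * rng.gauss(0.0, 1.0) for v in voiced]
    for freq_hz, bw_hz in formants:
        signal = resonate(signal, freq_hz, bw_hz, fs)
    for freq_hz, bw_hz in antiformants:
        signal = antiresonate(signal, freq_hz, bw_hz, fs)
    return signal

# Example: a breathier variant uses a larger open quotient and more noise.
modal = synthesize(1600)
breathy = synthesize(1600, open_quotient=0.8, noise_level=0.3)
```

In this sketch, raising `open_quotient` and `noise_level` loosely corresponds to the kind of parameter editing the abstract describes for moving from modal toward breathy voice; the spectral tilt parameter is not modeled here.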
Keywords :
noise; parameter estimation; spectral analysis; speech processing; speech synthesis; statistical analysis; ARX speech production model; Auto-Regressive with Exogenous Input; MS-Windows; acoustic correlates; bandwidths; cascade formant-antiformant synthesizers; flexible editing tool; formant-type speech analysis system; fundamental frequency; glottal open quotient; natural sounding speech; natural speech; natural speech waveforms; spectral tilt parameter; speech synthesis system; turbulent noise level; unvoiced turbulent noise source; voicing source; Acoustic noise; Bandwidth; Frequency synthesizers; Natural languages; Noise level; Production systems; Speech analysis; Speech enhancement
Conference_Title :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
DOI :
10.1109/ICSLP.1996.607879