Title :
Speech synthesis using subband-coded multiband source components and sinusoids
Author :
Nishizawa, N.; Kato, Toshihiko
Author_Institution :
KDDI R&D Labs. Inc., Fujimino, Japan
Abstract :
An improved filter-bank-based speech waveform generation method for speech synthesizers is proposed, in which the spectral features of synthetic sounds are constructed by amplitude modification and summation of predecomposed source waveforms. Because all operations are performed in the subband-coded domain at a reduced sampling rate, the computational cost can also be reduced. Moreover, to improve the accuracy of spectral reproduction in the low-frequency range of voiced sounds, sinusoidal synthesis performed directly on the low subbands is also introduced. A subjective test using resynthesized speech from a male and a female narrator indicated that the proposed method was significantly superior to two conventional methods: one using a mel log spectrum approximation (MLSA) filter, and our previously proposed method using a nonmaximally decimated filter bank.
Keywords :
approximation theory; channel bank filters; speech synthesis; MLSA filter; amplitude modification; decimated filter bank; filter banks; mel log spectrum approximation; predecomposed source waveforms; sinusoidal synthesis; spectral features; spectral reproduction; speech synthesis; speech waveform generation method; subband coded multiband source components; subband coded multiband source sinusoids; synthetic sounds; voiced sounds; Encoding; Hidden Markov models; Prototypes; Speech; Speech synthesis; Synthesizers; Vectors; HMM-based speech synthesis; embedded systems; filter bank; speech waveform generation; subband coding;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
DOI :
10.1109/ICASSP.2013.6639223