Practical high-quality speech and voice synthesis using fixed frame rate ABS/OLA sinusoidal modeling

Author

George, E. Bryan

Author_Institution

DSP Solutions Research & Development Center, Texas Instruments Inc., Dallas, TX, USA

Volume

1

fYear

1998

fDate

12-15 May 1998

Firstpage

301

Abstract

This paper describes algorithms developed to apply the analysis-by-synthesis/overlap-add (ABS/OLA) sinusoidal modeling system to real-time speech and singing voice synthesis. As originally proposed, the ABS/OLA system is limited to unidirectional time scaling and relies on variable frame length to accomplish time-scale modification. For speech and voice synthesis applications, unidirectional time scaling makes it difficult to loop effectively to produce sustained vocal sounds, and variable frame length makes real-time polyphonic synthesis problematic. This paper presents a reformulation of the basic ABS/OLA system that addresses these issues, termed fixed-rate ABS/OLA (ABS/OLA-FR).
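
The contrast between variable frame length and a fixed frame rate can be illustrated with a minimal sketch. It is not the paper's ABS/OLA-FR algorithm: it assumes a plain sum-of-sinusoids model, a triangular overlap-add window, linear parameter interpolation, and a simplified phase update, and the names `interp_frame`, `synthesize_fixed_rate`, `frames`, `positions`, and `hop` are hypothetical. Every synthesis frame has the same length; time scaling comes only from where the analysis parameters are read, and the read position may move backward, which is what allows looping.

```python
import math


def interp_frame(frames, pos):
    """Linearly interpolate (amplitude, frequency_hz) pairs at fractional index pos."""
    i = max(0, min(int(math.floor(pos)), len(frames) - 2))
    t = pos - i
    return [((1 - t) * a0 + t * a1, (1 - t) * f0 + t * f1)
            for (a0, f0), (a1, f1) in zip(frames[i], frames[i + 1])]


def synthesize_fixed_rate(frames, positions, hop, sample_rate):
    """Overlap-add synthesis with a constant hop size.

    frames:    analysis frames, each a list of (amplitude, frequency_hz) pairs
    positions: fractional analysis-frame index to read for each synthesis frame;
               it may advance slowly, quickly, or backward (assumption of this sketch)
    """
    out = [0.0] * (hop * (len(positions) + 1))
    phases = [0.0] * len(frames[0])  # phase of each sinusoid at the frame center
    for k, pos in enumerate(positions):
        params = interp_frame(frames, pos)
        start = k * hop
        for n in range(2 * hop):
            # Triangular window: adjacent frames sum to 1 at a hop of half the frame.
            w = 1.0 - abs(n - hop) / hop
            s = sum(a * math.cos(ph + 2 * math.pi * f * (n - hop) / sample_rate)
                    for (a, f), ph in zip(params, phases))
            out[start + n] += w * s
        # Simplification: advance each phase by one hop at the current frequency.
        phases = [ph + 2 * math.pi * f * hop / sample_rate
                  for (_, f), ph in zip(params, phases)]
    return out


# Read two analysis frames forward and then backward (a simple loop).
frames = [[(0.5, 440.0)], [(0.5, 440.0)]]
positions = [0.0, 0.5, 1.0, 0.5, 0.0]
signal = synthesize_fixed_rate(frames, positions, hop=64, sample_rate=8000)
```

In this sketch the output length depends only on the number of synthesis frames, so several voices can share one frame clock regardless of how each is time-scaled.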

Keywords

interpolation; parameter estimation; speech intelligibility; speech synthesis; ABS/OLA-FR; FFT; algorithms; analysis-by-synthesis/overlap-add; fixed frame rate ABS/OLA sinusoidal modeling; fixed-rate ABS/OLA; high-quality speech synthesis; parameter interpolation; real-time polyphonic synthesis; real-time speech synthesis; singing voice synthesis; sustained vocal sounds; time-scale modification; unidirectional time scaling; variable frame length; Algorithm design and analysis; Digital signal processing; Instruments; Music; Real time systems; Research and development; Signal synthesis; Speech analysis; Speech synthesis; Synthesizers

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on

Conference_Location

Seattle, WA

ISSN

1520-6149

Print_ISBN

0-7803-4428-6

Type

conf

DOI

10.1109/ICASSP.1998.674427

Filename

674427