Voice characteristics conversion for TTS using reverse VTLN

Author

Eichner, Matthias ; Wolff, Matthias ; Hoffmann, Rüdiger

Author_Institution

Dresden Univ. of Technol., Germany

Volume

1

fYear

2004

fDate

17-21 May 2004

Abstract

In the past, several approaches have been proposed for voice conversion in TTS systems. Mostly, conversion is done by modification of the spectral properties and pitch to match a certain target voice. This conversion causes distortions that deteriorate the quality of the synthesized speech. In this paper we investigate a very simple and straightforward method for voice conversion. It generates a new voice from the source speaker instead of generating a certain target speaker's voice. For application in TTS systems it is often sufficient to synthesize new voices that sound sufficiently different to be distinguishable from each other. This is done by applying a spectral warping technique that is commonly used for speaker normalization in speech recognition systems called vocal tract length normalization (VTLN). Due to the low requirements of resources this method is especially suited for embedded systems.
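
Illustrative note (not part of the original record): the abstract does not state which warping function the authors use, so the following is only a minimal sketch of VTLN-style spectral warping under stated assumptions. It assumes a piecewise-linear warp of the normalized frequency axis, a magnitude spectrum sampled uniformly on [0, pi], and a hypothetical breakpoint parameter omega_0. The function names warp_frequencies and warp_spectrum and the warping factor alpha are introduced here for illustration only.

```python
import numpy as np


def warp_frequencies(omega, alpha, omega_0=0.875 * np.pi):
    """Piecewise-linear warp of normalized frequencies in [0, pi].

    Assumption: below a breakpoint the axis is scaled by alpha; above it a
    second linear segment is used so that pi still maps to pi.
    """
    omega = np.asarray(omega, dtype=float)
    # Shrink the breakpoint for alpha > 1 so that alpha * b never exceeds omega_0.
    b = omega_0 * min(1.0, 1.0 / alpha)
    lower = alpha * omega
    upper = alpha * b + (np.pi - alpha * b) / (np.pi - b) * (omega - b)
    return np.where(omega <= b, lower, upper)


def warp_spectrum(magnitude, alpha):
    """Resample a magnitude spectrum along the warped frequency axis.

    The output at frequency w is the input read at warp_frequencies(w), so
    with this convention alpha > 1 moves spectral peaks to lower frequencies
    and alpha < 1 moves them to higher frequencies.
    """
    magnitude = np.asarray(magnitude, dtype=float)
    grid = np.linspace(0.0, np.pi, magnitude.size)
    return np.interp(warp_frequencies(grid, alpha), grid, magnitude)


if __name__ == "__main__":
    # Toy spectrum with a single formant-like peak at 0.4 * pi.
    grid = np.linspace(0.0, np.pi, 257)
    spectrum = np.exp(-0.5 * ((grid - 0.4 * np.pi) / 0.1) ** 2)
    for alpha in (0.9, 1.0, 1.25):
        peak = grid[np.argmax(warp_spectrum(spectrum, alpha))]
        print(f"alpha={alpha}: peak at {peak / np.pi:.3f} * pi")
```

In a TTS setting, a warp like this would be applied to the spectral envelope of each synthesized frame, with one fixed alpha per generated voice. That is consistent with the low resource requirements mentioned in the abstract, but the details here are assumptions, not the authors' implementation.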

Keywords

embedded systems; spectral analysis; speech recognition; speech synthesis; TTS; embedded systems; reverse VTLN; source speaker; speaker normalization; spectral warping technique; speech recognition systems; vocal tract length normalization; voice characteristics conversion; Acoustic distortion; Character recognition; Databases; Embedded system; Loudspeakers; Signal processing; Signal synthesis; Speech recognition; Speech synthesis; Synthesizers;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on

ISSN

1520-6149

Print_ISBN

0-7803-8484-9

Type

conf

DOI

10.1109/ICASSP.2004.1325911

Filename

1325911