Regional Information Center for Science and Technology (RICEST) - Optimizing speech synthesizer memory footprint through phoneme set reduction

DocumentCode :

1937596

Title :

Optimizing speech synthesizer memory footprint through phoneme set reduction

Author :

Moberg, Marko ; Viikki, Olli

Author_Institution :

Speech & Audio Syst. Lab., Nokia Res. Center, Tampere, Finland

fYear :

2002

fDate :

11-13 Sept. 2002

Firstpage :

171

Lastpage :

174

Abstract :

The embedded device market is currently searching for low-memory-footprint solutions that enable the use of speech technology, including speech synthesis, in mass-market products. The amount of memory consumed has a direct impact on product manufacturing costs; therefore, every means of saving memory should be exploited. In speech synthesis, some memory can be saved by reducing the number of phonemes used for a given language. According to a listening evaluation test, certain affricates, diphthongs and long vowels in USA-English can be expressed as a combination of two other phonemes. Equal or improved intelligibility and quality were achieved by adding one new phoneme to the phoneme set while removing four of the original phonemes: /tS/, /e/, /O/ and /OI/. This net decrease of three phonemes reduced the memory required to store Klatt88 synthesis parameters by 7% and the memory needed for the speech database in diphone concatenation synthesis by approximately 10%. More substantial memory savings can be achieved if a small degradation in quality and intelligibility is accepted.

Keywords :

embedded systems; speech intelligibility; speech processing; speech synthesis; Klatt88 synthesis parameters; USA-English; affricates; diphone concatenation synthesis; diphthongs; embedded device; intelligibility; long vowels; memory footprint; memory saving; phoneme combination; phoneme set reduction; speech quality; Audio systems; Costs; Databases; Handheld computers; Laboratories; Manufacturing; Natural languages; Synthesizers; Testing

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Proceedings of 2002 IEEE Workshop on Speech Synthesis

Print_ISBN :

0-7803-7395-2

Type :

conf

DOI :

10.1109/WSS.2002.1224401

Filename :

1224401

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1937596