Forward masking phenomenon in concatenative speech synthesis

Author

M. Cernak;G. Rozinaj

Author_Institution

Fac. of Electr. Eng., Slovak Tech. Univ., Bratislava, Slovakia

Volume

fYear

2003

fDate

2003

Firstpage

691

Abstract

The approach described in this paper aims to bring more knowledge into the design of concatenative text-to-speech systems. The knowledge is based on the masking phenomenon of the inner ear, in particular its temporal (forward) masking properties. Such a knowledge-based design is suggested for use in unit selection-based speech synthesis, currently a prominent technique in concatenative synthesis, which relies on a large speech corpus. The more prosodic variability the corpus captures, the more natural the synthetic voice sounds, and the more opportunities there are for forward masking events to occur when candidate units selected from the corpus are concatenated.

Keywords

"Speech synthesis","Humans","Speech analysis","Speech coding","Frequency","Cost function","Knowledge based systems","Speech processing","Cepstral analysis","Ear"

Publisher

IEEE

Conference_Titel

Video/Image Processing and Multimedia Communications, 2003. 4th EURASIP Conference focused on

Print_ISBN

953-184-054-7

Type

conf

DOI

10.1109/VIPMC.2003.1220544

Filename

1220544

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3614533