Regional Information Center for Science and Technology - Removing linear phase mismatches in concatenative speech synthesis

DocumentCode :

1445795

Title :

Removing linear phase mismatches in concatenative speech synthesis

Author :

Stylianou, Yannis

Author_Institution :

Lucent Technol. Bell Labs., Murray Hill, NJ, USA

Volume :

Issue :

fYear :

2001

fDate :

3/1/2001

Firstpage :

232

Lastpage :

239

Abstract :

Many current text-to-speech (TTS) systems are based on the concatenation of acoustic units of recorded speech. While this approach is believed to lead to higher intelligibility and naturalness than synthesis-by-rule, it has to cope with the issues of concatenating acoustic units that have been recorded at different times and in a different order. One important issue related to the concatenation of these acoustic units is their synchronization. In terms of signal processing, this means removing linear phase mismatches between concatenated speech frames. This paper presents two novel approaches to the problem of synchronization of speech frames with an application to concatenative speech synthesis. Both methods are based on the processing of phase spectra without, however, decreasing the quality of the output speech, in contrast to previously proposed methods. The first method is based on the notion of center of gravity and the second on differentiated phase data. They are applied off-line, during the preparation of the speech database, and therefore add no computational burden to synthesis. The proposed methods have been tested with the harmonic plus noise model (HNM) and the TTS system of AT&T Labs. The resulting synthetic speech is free of linear phase mismatches.

Keywords :

acoustic signal processing; hidden Markov models; noise; spectral analysis; speech intelligibility; speech synthesis; AT&T Labs; HNM; TTS system; acoustic units concatenation; center of gravity; concatenated speech frames; concatenative speech synthesis; differentiated phase data; harmonic plus noise model; linear phase mismatch removal; output speech quality; phase spectra processing; recorded speech; signal processing; speech database; speech intelligibility; synchronization; text-to-speech systems; Acoustic signal processing; Concatenated codes; Databases; Frequency synchronization; Gravity; Interpolation; Signal synthesis; Smoothing methods; Speech processing; Speech synthesis;

fLanguage :

English

Journal_Title :

Speech and Audio Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1063-6676

Type :

jour

DOI :

10.1109/89.905997

Filename :

905997

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1445795