DocumentCode :
779838
Title :
Variable-Length Unit Selection in TTS Using Structural Syntactic Cost
Author :
Wu, Chung-Hsien ; Hsia, Chi-Chun ; Chen, Jiun-Fu ; Wang, Jhing-Fa
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan
Volume :
15
Issue :
4
fYear :
2007
fDate :
5/1/2007
Firstpage :
1227
Lastpage :
1235
Abstract :
This paper presents a variable-length unit selection scheme that uses a syntactic cost to select text-to-speech (TTS) synthesis units. The syntactic structure of a sentence is derived from a probabilistic context-free grammar (PCFG) and represented as a syntactic vector. The syntactic difference between target and candidate units (words or phrases) is estimated by the cosine measure, with the inside probability of the PCFG acting as a weight. Latent semantic analysis (LSA) is applied to reduce the dimensionality of the syntactic vectors. A dynamic programming algorithm is used to obtain the concatenated unit sequence with minimum cost. A speech database rich in syntactic properties is designed and collected as the unit inventory. Several experiments with statistical testing are conducted to assess the quality of the synthetic speech as perceived by human subjects. The proposed method outperforms a synthesizer that does not consider syntactic properties. The structural syntax estimates the substitution cost better than the acoustic features alone.
Keywords :
dynamic programming; speech synthesis; statistical testing; TTS; dynamic programming algorithm; latent semantic analysis (LSA); probabilistic context-free grammar (PCFG); structural syntactic cost; structural syntax; syntactic structure; text-to-speech synthesis; variable-length unit selection; Computer science; Concatenated codes; Costs; Heuristic algorithms; Humans; Spatial databases; Statistical analysis; Synthesizers
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2006.889752
Filename :
4156186