Regional Information Center for Science and Technology - An evaluation of automatic phone segmentation for concatenative speech synthesis

DocumentCode :

417236

Title :

An evaluation of automatic phone segmentation for concatenative speech synthesis

Author :

Kawai, Hisashi ; Toda, Tomoki

Author_Institution :

ATR Spoken Language Translation Res. Labs., Japan

Volume :

fYear :

2004

fDate :

17-21 May 2004

Abstract :

This paper studies the performance of automatic phone segmentation from two viewpoints: temporal precision and the effect on the naturalness of synthetic speech. The absolute errors of the phone onset time for the best 90% and worst 10% were 4.6 ms and 25.9 ms, respectively. These values are comparable to discrepancies among human labelers. In perception tests that pair-compared the naturalness of synthetic speech generated from hand-segmented data and from auto-segmented data, the latter was found to be statistically inferior.

Keywords :

hidden Markov models; speech processing; speech synthesis; 25.9 ms; 4.6 ms; HMM; TTS; acoustic analysis; auto-segmented data; automatic phone segmentation; concatenative speech synthesis; hand-segmented data; linguistic analysis; natural synthetic speech; perception tests; phone onset time error; segmented speech corpora; temporal precision; text to speech synthesis; Automatic testing; Context modeling; Costs; Degradation; Humans; Laboratories; Natural languages; Synthesizers

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on

ISSN :

1520-6149

Print_ISBN :

0-7803-8484-9

Type :

conf

DOI :

10.1109/ICASSP.2004.1326076

Filename :

1326076

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=417236