Title :
The broad study of homograph disambiguity for Mandarin speech synthesis
Author :
Wang, Wern-Jun ; Hwang, Shaw-Hwa ; Chen, Sin-Horng
Author_Institution :
Nat. Chiao Tung Univ., Hsinchu, Taiwan
Abstract :
Improving the intelligibility and naturalness of synthetic speech has drawn much attention in Mandarin text-to-speech (TTS) research. These two qualities have long been treated as the bottleneck because of their effect on human perception. However, as the quality of synthetic speech improves for syllables, words, and phrases, there is a growing need to improve the components of text processing as well. One such improvement for Mandarin speech synthesis is the accuracy of the character-to-sound (CTS) process. From an application viewpoint, speech synthesis should aim to make synthetic speech understandable to listeners and to minimize misunderstanding, so increasing the accuracy of the CTS process is very important. The CTS process predicts phonetic pronunciations from coarse surface text input, and its difficulty results mainly from ambiguous homograph characters. We propose an effective analysis method that incorporates linguistic knowledge to resolve homograph ambiguity. The methods used in the experiments are discriminating lexical association and a tree-based language model. In the experimental results, the average accuracy rate improves by about 10% over the traditional maximum frequency guess approach for most ambiguous homograph characters.
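For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of a maximum frequency guess: each homograph character is always assigned the pronunciation it takes most often in a labeled corpus, regardless of context. The function names, the pinyin-with-tone-digit labels, and the toy counts are illustrative assumptions and are not taken from the paper; the paper's own methods (discriminating lexical association and a tree-based language model) are not reproduced here.

```python
from collections import Counter, defaultdict


def train_max_frequency(samples):
    """Build a context-free lookup from (character, pronunciation) pairs.

    `samples` is assumed to come from a pronunciation-labeled corpus.
    """
    counts = defaultdict(Counter)
    for char, pron in samples:
        counts[char][pron] += 1
    # For each character, keep only its most frequently observed pronunciation.
    return {char: c.most_common(1)[0][0] for char, c in counts.items()}


def predict(model, char, default=None):
    """Return the stored majority pronunciation, or `default` if unseen."""
    return model.get(char, default)


# Toy, made-up training pairs (hypothetical, not data from the paper).
toy_samples = [
    ("行", "xing2"), ("行", "xing2"), ("行", "hang2"),
    ("長", "chang2"), ("長", "zhang3"), ("長", "chang2"),
]
model = train_max_frequency(toy_samples)
```

Because the lookup ignores the surrounding words, it is wrong every time a character appears with a minority reading; that is the kind of error context-sensitive methods aim to reduce.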
Keywords :
linguistics; natural language interfaces; performance evaluation; speech intelligibility; speech synthesis; trees (mathematics); Mandarin speech synthesis; ambiguous homograph character; character-to-sound process; discriminating lexical association; homograph disambiguity; human perception; linguistic knowledge; maximum frequency guess approach; phonetic pronunciations; phrases; speech naturalness; syllables; synthetic speech; text processing; text-to-speech; tree-based language model; words; Dictionaries; Humans; Laboratories; Natural languages; Process design; Real time systems; Speech processing; Tagging
Conference_Title :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
DOI :
10.1109/ICSLP.1996.607873