DocumentCode :
1653173
Title :
Experimental study of structure to speech conversion
Author :
Minematsu, Nobuaki ; Saito, Daisuke ; Hirose, Keikichi
Author_Institution :
Univ. of Tokyo, Tokyo
fYear :
2008
Firstpage :
651
Lastpage :
654
Abstract :
Most speech synthesizers have been developed as text (phoneme sequence) to speech converters, and in this framework text input is a precondition for speech production. However, no child acquires spoken language by reading a given text aloud. Children are said to acquire spoken language by imitating the utterances of their parents, yet they never imitate their parents' voices. Developmental psychology claims that they extract a holistic and speaker-invariant sound pattern embedded in a given utterance, called the word Gestalt, and realize that pattern acoustically using their own short vocal tubes. In our previous studies, we mathematically defined this holistic and speaker-invariant pattern and used it for automatic speech recognition (ASR). Here, we experimentally implement its inverse process, i.e., Gestalt-to-utterance conversion, on a computer.
Keywords :
natural language processing; speech synthesis; developmental psychology; infant-like vocal imitation; phoneme sequence; speaker-invariant pattern; speaker-invariant sound pattern; speech conversion; speech converter; speech production; speech synthesizer; text input; word Gestalt; Automatic speech recognition; Birds; Loudspeakers; Matrix converters; Natural languages; Psychology; Robustness; Shape; Speech recognition; Speech synthesis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing, 2008. ICSP 2008. 9th International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-2178-7
Electronic_ISBN :
978-1-4244-2179-4
Type :
conf
DOI :
10.1109/ICOSP.2008.4697215
Filename :
4697215