Speech synthesis from real time ultrasound images of the tongue

Author

Denby, Bruce ; Stone, Maureen

Author_Institution

Lab. des Instruments et Systemes, Univ. Pierre et Marie Curie, Paris, France

Volume

fYear

2004

fDate

17-21 May 2004

Abstract

A machine learning technique is used to match reconstructed tongue contours in 30-frame-per-second ultrasound images to speaker vocal tract parameters obtained from a synchronized audio track. Speech synthesized using the learned parameters and noise as an activation function displays many of the time- and frequency-domain characteristics of the original audio and, for isolated passages, is remarkably clear, although no articulators other than the tongue are included.

Keywords

biomedical ultrasonics; image reconstruction; image sequences; learning (artificial intelligence); medical image processing; speech synthesis; audio track; machine learning technique; medical ultrasound; real time ultrasound images; tongue contours reconstruction; vocal tract parameters; Biomedical imaging; Data mining; Data visualization; GSM; Instruments; Speech codecs; Speech enhancement; Tongue; Ultrasonic imaging

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on

ISSN

1520-6149

Print_ISBN

0-7803-8484-9

Type

conf

DOI

10.1109/ICASSP.2004.1326078

Filename

1326078

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3328454