DocumentCode
134308
Title
Mapping between ultrasound and vowel speech using DNN framework
Author
Xinyuan Zheng ; Jianguo Wei ; Wenhuan Lu ; Qiang Fang ; Jianwu Dang
Author_Institution
Sch. of Comput. Sci. & Technol., Tianjin Univ., Tianjin, China
fYear
2014
fDate
12-14 Sept. 2014
Firstpage
372
Lastpage
376
Abstract
Building a mapping between articulatory movements and the corresponding speech could greatly facilitate speech training and speech aids for voiceless patients. In this paper, we propose a deep learning framework for building a mapping between articulatory information recorded by an ultrasound system and the corresponding speech. The dataset includes six Chinese vowels. We use a Bimodal Deep Autoencoder algorithm based on RBMs to learn the relationship between speech and articulation, i.e., the weight matrix of their representations. Speech and ultrasound images are reconstructed using the extracted features. The reconstruction error for articulation with our method is lower than that of a PCA-based approach, and the reconstructed speech is similar to the original. We also propose a mapping from ultrasound tongue images to the acoustic signal using a revised Denoising Autoencoder; the results show that it is a promising approach. In contrast, another experiment was conducted to synthesize ultrasound tongue images from speech, but its results still need improvement.
Keywords
acoustic signal processing; feature extraction; handicapped aids; image reconstruction; natural language processing; neural nets; speech processing; ultrasonic imaging; Chinese vowels; DNN framework; RBM; acoustic signal; articulatory movements; bimodal deep autoencoder algorithm; deep neural network framework; denoising autoencoder; feature extraction; speech aid; speech reconstruction; speech training; ultrasound image reconstruction; ultrasound system; ultrasound tongue image; voiceless patients; vowel speech; Acoustics; Feature extraction; Image reconstruction; Speech; Synchronization; Tongue; Ultrasonic imaging; DBN; Denoising Autoencoder; articulatory-acoustic mapping;
fLanguage
English
Publisher
IEEE
Conference_Titel
Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
Conference_Location
Singapore
Type
conf
DOI
10.1109/ISCSLP.2014.6936700
Filename
6936700