DocumentCode :
134308
Title :
Mapping between ultrasound and vowel speech using DNN framework
Author :
Xinyuan Zheng ; Jianguo Wei ; Wenhuan Lu ; Qiang Fang ; Jianwu Dang
Author_Institution :
Sch. of Comput. Sci. & Technol., Tianjin Univ., Tianjin, China
fYear :
2014
fDate :
12-14 Sept. 2014
Firstpage :
372
Lastpage :
376
Abstract :
Building a mapping between articulatory movements and the corresponding speech could greatly facilitate speech training and speech aids for voiceless patients. In this paper, we propose a deep learning framework for building a mapping between articulatory information and the corresponding speech, recorded using an ultrasound system. The dataset includes six Chinese vowels. We use an RBM-based Bimodal Deep Autoencoder algorithm to learn the relationship between speech and articulation, together with the weight matrices of their representations. Speech and ultrasound images are reconstructed from the extracted features. The reconstruction error for articulation with our method is lower than that of a PCA-based approach, and the reconstructed speech is similar to the original. We also propose a mapping from ultrasound tongue images to the acoustic signal using a revised Denoising Autoencoder; the results show that it is a promising approach. Conversely, another experiment synthesizes ultrasound tongue images from speech, but its results still need improvement.
Keywords :
acoustic signal processing; feature extraction; handicapped aids; image reconstruction; natural language processing; neural nets; speech processing; ultrasonic imaging; Chinese vowels; DNN framework; RBM; acoustic signal; articulatory movements; bimodal deep autoencoder algorithm; deep neural network framework; denoising autoencoder; feature extraction; speech aid; speech reconstruction; speech training; ultrasound image reconstruction; ultrasound system; ultrasound tongue image; voiceless patients; vowel speech; Acoustics; Feature extraction; Image reconstruction; Speech; Synchronization; Tongue; Ultrasonic imaging; DBN; Denoising Autoencoder; articulatory-acoustic mapping;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
Conference_Location :
Singapore
Type :
conf
DOI :
10.1109/ISCSLP.2014.6936700
Filename :
6936700