Regional Information Center for Science and Technology (RICEST) - Mapping between ultrasound and vowel speech using DNN framework

DocumentCode :

134308

Title :

Mapping between ultrasound and vowel speech using DNN framework

Author :

Xinyuan Zheng ; Jianguo Wei ; Wenhuan Lu ; Qiang Fang ; Jianwu Dang

Author_Institution :

Sch. of Comput. Sci. & Technol., Tianjin Univ., Tianjin, China

fYear :

2014

fDate :

12-14 Sept. 2014

Firstpage :

372

Lastpage :

376

Abstract :

Building a mapping between articulatory movements and the corresponding speech could greatly facilitate speech training and speech aids for voiceless patients. In this paper, we propose a deep learning framework for building a mapping between articulatory information and the corresponding speech, which were recorded with an ultrasound system. The dataset includes six Chinese vowels. We use a Bimodal Deep Autoencoder algorithm based on RBMs to learn the relationship between speech and articulation, that is, the weight matrices of their representations. Speech and ultrasound images are reconstructed using the extracted features. The reconstruction error of articulation with our method is lower than that of a PCA-based approach, and the reconstructed speech is similar to the original. We also propose a mapping from ultrasound tongue images to the acoustic signal using a revised Denoising Autoencoder; the results show that it is a promising approach. A further experiment synthesizes the ultrasound tongue image from the speech, but its results still need improvement.
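
The abstract compares reconstruction error against a PCA-based approach without giving the metric or the setup. As a rough illustration only, the sketch below shows one common way such a PCA baseline might be computed: project the data onto its top principal components, map it back, and take the mean squared error. The function name `pca_reconstruction_error`, the use of mean squared error, and the synthetic data standing in for flattened ultrasound frames are all assumptions made here, not details from the paper.

```python
import numpy as np


def pca_reconstruction_error(X, n_components):
    """Mean squared error after projecting X onto its top principal
    components and mapping it back (hypothetical helper, not from the paper)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions, sorted by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_components]
    # Project onto the retained directions, then reconstruct in the original space.
    X_hat = Xc @ W.T @ W + mean
    return float(np.mean((X - X_hat) ** 2))


rng = np.random.default_rng(0)
# Synthetic stand-in for flattened image frames: 200 samples, 64 features,
# generated from 5 latent factors plus small noise (assumed, not real data).
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 64))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 64))

err_2 = pca_reconstruction_error(X, 2)    # too few components: large error
err_5 = pca_reconstruction_error(X, 5)    # matches the latent dimension
err_64 = pca_reconstruction_error(X, 64)  # all components: near-exact
```

An autoencoder's reconstruction error on the same data, computed with the same metric, could then be compared against `err_5` or another chosen number of components; which component count the authors used is not stated in this record.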

Keywords :

acoustic signal processing; feature extraction; handicapped aids; image reconstruction; natural language processing; neural nets; speech processing; ultrasonic imaging; Chinese vowels; DNN framework; RBM; acoustic signal; articulatory movements; bimodal deep autoencoder algorithm; deep neural network framework; denoising autoencoder; feature extraction; speech aid; speech reconstruction; speech training; ultrasound image reconstruction; ultrasound system; ultrasound tongue image; voiceless patients; vowel speech; Acoustics; Feature extraction; Image reconstruction; Speech; Synchronization; Tongue; Ultrasonic imaging; DBN; Denoising Autoencoder; articulatory-acoustic mapping;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on

Conference_Location :

Singapore

Type :

conf

DOI :

10.1109/ISCSLP.2014.6936700

Filename :

6936700

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=134308