Regional Information Center for Science and Technology - Estimate articulatory MRI series from acoustic signal using deep architecture

DocumentCode :

3431070

Title :

Estimate articulatory MRI series from acoustic signal using deep architecture

Author :

Hao Li ; Jianhua Tao ; Minghao Yang ; Bin Liu

Author_Institution :

Nat. Lab. of Pattern Recognition, Inst. of Autom., Beijing, China

fYear :

2015

fDate :

19-24 April 2015

Firstpage :

4854

Lastpage :

4858

Abstract :

This paper presents our work on acoustic-to-articulatory inversion mapping, in which the articulatory data are MRI series of the articulators in the mid-sagittal plane. Deep architectures based on restricted Boltzmann machines (RBMs) and linear regression are employed to construct the audio-visual mapping. We test two architectures for initializing the neural network: a bottom-up stack of RBMs with a regression layer on top, and the same architecture with an extra Gaussian-Bernoulli RBM added on top. GMM-based mapping is used as the baseline method. MRI data from the USC-TIMIT database are used for training. The experimental results show that the deep regression network is an effective model for mapping the acoustic speech signal to articulatory MRI series, and also indicate that initializing the top layer as a Gaussian-Bernoulli RBM, to compress the MRI data before the linear regression, is the better strategy.
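
The first architecture named in the abstract (a bottom-up stack of RBMs with a linear regression layer on top) can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions, not the authors' implementation: the layer sizes, learning rate, CD-1 training rule, random placeholder data, and all function names are hypothetical, and both the fine-tuning of the network and the extra Gaussian-Bernoulli RBM of the second architecture are left out.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli-Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)
        self.rng = rng

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def fit(self, data, epochs=20, lr=0.1):
        n = data.shape[0]
        for _ in range(epochs):
            # Positive phase: hidden activations driven by the data.
            h0 = self.hidden_probs(data)
            h_sample = (self.rng.random(h0.shape) < h0).astype(float)
            # Negative phase: one Gibbs step back to the visibles and up again.
            v1 = self.visible_probs(h_sample)
            h1 = self.hidden_probs(v1)
            self.W += lr * (data.T @ h0 - v1.T @ h1) / n
            self.b_v += lr * (data - v1).mean(axis=0)
            self.b_h += lr * (h0 - h1).mean(axis=0)
        return self

def pretrain_stack(x, layer_sizes, rng):
    """Greedy bottom-up pretraining: each RBM is trained on the layer below."""
    rbms, h = [], x
    for n_hidden in layer_sizes:
        rbm = RBM(h.shape[1], n_hidden, rng).fit(h)
        rbms.append(rbm)
        h = rbm.hidden_probs(h)
    return rbms

def encode(rbms, x):
    h = x
    for rbm in rbms:
        h = rbm.hidden_probs(h)
    return h

def fit_linear_top(features, targets):
    """Least-squares linear regression layer with a bias column."""
    design = np.hstack([features, np.ones((features.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights

def predict(rbms, weights, x):
    feats = encode(rbms, x)
    design = np.hstack([feats, np.ones((feats.shape[0], 1))])
    return design @ weights

# Placeholder data (assumption): random values in [0, 1] standing in for
# acoustic feature frames and flattened mid-sagittal MRI frames.
rng = np.random.default_rng(0)
acoustic = rng.random((200, 39))
mri = rng.random((200, 64))

rbms = pretrain_stack(acoustic, [32, 16], rng)
weights = fit_linear_top(encode(rbms, acoustic), mri)
estimate = predict(rbms, weights, acoustic)
```

In this sketch the RBM stack only supplies the features for the regression layer. In the second architecture described in the abstract, a Gaussian-Bernoulli RBM would additionally compress the MRI frames before the linear regression.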

Keywords :

Boltzmann machines; acoustic signal processing; audio databases; magnetic resonance imaging; speech processing; MRI data; RBM; USC-TIMIT database; acoustic signal; acoustic speech signal; acoustic-to-articulatory inversion mapping; audio-visual mapping; deep architecture; deep architectures; deep regression network; estimate articulatory MRI series; extra Gaussian-Bernoulli RBM; linear regression; midsagittal plane; neural network; restricted Boltzmann machine; top regression layer architecture; Acoustics; Head; Linear regression; Magnetic resonance imaging; Speech; Tongue; Training; MRI; acoustic-to-articulatory inversion; deep neural network; deep regression network

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on

Conference_Location :

South Brisbane, QLD

Type :

conf

DOI :

10.1109/ICASSP.2015.7178893

Filename :

7178893

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3431070