Regional Information Center for Science and Technology - Data augmentation for deep convolutional neural network acoustic modeling

DocumentCode :

730710

Title :

Data augmentation for deep convolutional neural network acoustic modeling

Author :

Cui, Xiaodong ; Goel, Vaibhava ; Kingsbury, Brian

Author_Institution :

IBM T. J. Watson Research Center, Yorktown Heights, NY, USA

fYear :

2015

fDate :

19-24 April 2015

Firstpage :

4545

Lastpage :

4549

Abstract :

This paper investigates data augmentation based on label-preserving transformations for deep convolutional neural network (CNN) acoustic modeling to deal with limited training data. We show how stochastic feature mapping (SFM) can be carried out when training CNN models with log-Mel features as input and compare it with vocal tract length perturbation (VTLP). Furthermore, a two-stage data augmentation scheme with a stacked architecture is proposed to combine VTLP and SFM as complementary approaches. Improved performance has been observed in experiments conducted on the limited language pack (LLP) of Haitian Creole in the IARPA Babel program.

Keywords :

data handling; neural nets; speech processing; stochastic processes; CNN acoustic modeling; Haitian Creole; LLP; SFM; VTLP; data augmentation; deep convolutional neural network acoustic modeling; label preserving transformations; limited language pack; limited training data; log-Mel features; speech related applications; stacked architecture; stochastic feature mapping; vocal tract length perturbation; Acoustics; Adaptation models; Atmospheric modeling; Feedforward neural networks; bottleneck features; convolutional neural networks

fLanguage :

English

Publisher :

IEEE

Conference_Title :

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Conference_Location :

South Brisbane, QLD

Type :

conf

DOI :

10.1109/ICASSP.2015.7178831

Filename :

7178831

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=730710