Regional Information Center for Science and Technology (RICeST) - Stochastic data sweeping for fast DNN training

DocumentCode :

177478

Title :

Stochastic data sweeping for fast DNN training

Author :

Wei Deng ; Yanmin Qian ; Yuchen Fan ; Tianfan Fu ; Kai Yu

Author_Institution :

Dept. of Comput. Sci. & Eng., Shanghai Jiao Tong Univ., Shanghai, China

fYear :

2014

fDate :

4-9 May 2014

Firstpage :

240

Lastpage :

244

Abstract :

The context-dependent deep neural network (CD-DNN) has been successfully used in large vocabulary continuous speech recognition (LVCSR). However, the immense computational cost of mini-batch based back-propagation (BP) training has become a major obstacle to using massive speech data for DNN training. Previous work on accelerating BP training mainly focuses on parallelization with multiple GPUs. In this paper, a novel stochastic data sweeping (SDS) framework is proposed from a different perspective to speed up DNN training with a single GPU. Part of the training data is randomly selected from the whole set, and the quantity selected is gradually reduced at each training epoch. SDS uses less data over the entire training process and consequently saves tremendous training time. Since SDS works at the data level, it is complementary to parallel training strategies and can be integrated with them to form a much faster training framework. Experiments showed that combining SDS with asynchronous stochastic gradient descent (ASGD) can achieve an almost 3.0 times speed-up on 2 GPUs with no loss of recognition accuracy.
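
The data selection idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state the sampling rule or the reduction schedule, so everything specific here is an assumption: the function name `sds_epoch_subsets`, uniform sampling without replacement, and a geometric decay controlled by the hypothetical parameter `keep_ratio` are illustrative choices, not the authors' method.

```python
import random


def sds_epoch_subsets(num_samples, num_epochs, keep_ratio=0.8, seed=0):
    """Yield one list of training-sample indices per epoch.

    Each epoch draws a fresh random subset of the whole training set,
    and the subset size shrinks from one epoch to the next.
    ASSUMPTION: geometric decay by `keep_ratio` and uniform sampling;
    the abstract only says the quantity is "gradually reduced".
    """
    rng = random.Random(seed)
    fraction = 1.0  # the first epoch sweeps the full data set
    for _ in range(num_epochs):
        subset_size = max(1, int(round(num_samples * fraction)))
        # Sample without replacement from the whole set, not from the
        # previous epoch's subset, so every sample can be revisited.
        yield rng.sample(range(num_samples), subset_size)
        fraction *= keep_ratio


if __name__ == "__main__":
    for epoch, indices in enumerate(sds_epoch_subsets(1000, 5)):
        print(f"epoch {epoch}: training on {len(indices)} of 1000 samples")
```

In a real training loop, each yielded index list would be split into mini-batches and passed to the usual BP update. Because the selection happens before any gradient computation, a sketch like this is independent of how the updates themselves are parallelized.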

Keywords :

backpropagation; neural nets; speech recognition; stochastic processes; ASGD; BP training acceleration; CD-DNN; LVCSR; SDS framework; asynchronous stochastic gradient descent; context-dependent deep neural network; fast DNN training data; immense computational cost; large vocabulary continuous speech recognition; massive speech data; minibatch based back-propagation training; multiple GPUs; parallel training strategy; stochastic data sweeping framework; training epoch; Graphics processing units; Hidden Markov models; Speech; Speech recognition; Stochastic processes; Training; Training data; Asynchronous SGD; Deep neural network; GPU; Speech recognition; Stochastic Data Sweeping;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on

Conference_Location :

Florence

Type :

conf

DOI :

10.1109/ICASSP.2014.6853594

Filename :

6853594

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=177478