Title :
Parallel back-propagation neural network training technique using CUDA on multiple GPUs
Author :
Shunlu Zhang;Pavan Gunupudi;Qi-Jun Zhang
Author_Institution :
Department of Electronics, Carleton University, Ottawa, Ontario, Canada, K1S 5B6
Abstract :
A parallel Back-Propagation (BP) neural network training technique using the Compute Unified Device Architecture (CUDA) on multiple Graphics Processing Units (GPUs) is proposed. To exploit the full performance of the GPU, batch-mode BP training is implemented by organizing the input, hidden, and output neurons into matrix form. The implementation uses CUDA Basic Linear Algebra Subroutines (cuBLAS) functions to perform the matrix and vector operations, together with custom CUDA kernels. To achieve further acceleration, the proposed technique employs multiple GPUs: each GPU holds the same neural network structure and weight parameters, and the training samples are distributed among the GPUs. Each GPU computes a local training error and the gradients at each layer; these local results are transferred to the first GPU and summed. The summed gradients are then transferred back to each GPU to update the local weights, and the cycle repeats until the training goal is achieved. A microwave cavity bandpass filter example illustrates the validity of this technique.
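Code_Sketch :
The batch-mode formulation described in the abstract maps a whole batch of samples onto a single matrix-matrix product per layer. The fragment below is a minimal sketch of that idea, not the authors' implementation: it assumes a sigmoid activation, and the sizes nIn, nHid, and batch are illustrative placeholders.

    #include <cublas_v2.h>
    #include <cuda_runtime.h>

    // Elementwise sigmoid applied to the whole activation matrix
    // (the activation function is an assumption of this sketch).
    __global__ void sigmoid(float* z, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) z[i] = 1.0f / (1.0f + expf(-z[i]));
    }

    int main() {
        const int nIn = 2, nHid = 8, batch = 256;       // placeholder sizes
        float *dW, *dX, *dZ;
        cudaMalloc(&dW, nHid * nIn   * sizeof(float));  // weights, nHid x nIn
        cudaMalloc(&dX, nIn  * batch * sizeof(float));  // inputs,  nIn  x batch
        cudaMalloc(&dZ, nHid * batch * sizeof(float));  // hidden,  nHid x batch
        // ... load the weights into dW and a batch of training samples into dX ...

        cublasHandle_t h;
        cublasCreate(&h);
        const float one = 1.0f, zero = 0.0f;
        // One column-major GEMM, Z = W * X, propagates the entire batch at once.
        cublasSgemm(h, CUBLAS_OP_N, CUBLAS_OP_N, nHid, batch, nIn,
                    &one, dW, nHid, dX, nIn, &zero, dZ, nHid);
        sigmoid<<<(nHid * batch + 255) / 256, 256>>>(dZ, nHid * batch);
        cudaDeviceSynchronize();

        cublasDestroy(h);
        cudaFree(dW); cudaFree(dX); cudaFree(dZ);
        return 0;
    }

For the multi-GPU step, one possible reading of the gradient exchange described in the abstract is a peer-to-peer reduction onto the first GPU followed by a broadcast. The helper below is a hypothetical sketch under that assumption; the names dGrad, dScratch, and aggregate are invented for illustration.

    #include <cublas_v2.h>
    #include <cuda_runtime.h>

    // Sum the per-GPU gradients on device 0, then broadcast the result back.
    // dGrad[g] is device g's local gradient of length len; dScratch lives on
    // device 0; h0 is a cuBLAS handle created with device 0 current.
    void aggregate(float** dGrad, float* dScratch, cublasHandle_t h0,
                   int nGpu, int len) {
        const float one = 1.0f;
        cudaSetDevice(0);
        for (int g = 1; g < nGpu; ++g) {
            // Pull GPU g's local gradient over to GPU 0 ...
            cudaMemcpyPeer(dScratch, 0, dGrad[g], g, len * sizeof(float));
            // ... and accumulate it: dGrad[0] += dScratch.
            cublasSaxpy(h0, len, &one, dScratch, 1, dGrad[0], 1);
        }
        // Broadcast the summed gradient so every GPU performs an
        // identical local weight update.
        for (int g = 1; g < nGpu; ++g)
            cudaMemcpyPeer(dGrad[g], g, dGrad[0], 0, len * sizeof(float));
    }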
Keywords :
"Training","Graphics processing units","Neurons","Biological neural networks","Microwave theory and techniques","Cavity resonators"
Conference_Title :
2015 IEEE MTT-S International Conference on Numerical Electromagnetic and Multiphysics Modeling and Optimization (NEMO)
DOI :
10.1109/NEMO.2015.7415056