DocumentCode
659449
Title
A parallel computing platform for training large scale neural networks
Author
Rong Gu ; Furao Shen ; Yihua Huang
Author_Institution
Nat. Key Lab. for Novel Software Technol., Nanjing Univ., Nanjing, China
fYear
2013
fDate
6-9 Oct. 2013
Firstpage
376
Lastpage
384
Abstract
Artificial neural networks (ANNs) have been applied successfully in a variety of pattern recognition and data mining applications. However, training ANNs on large-scale datasets is both data-intensive and computation-intensive, so large-scale ANNs are adopted with reservation because of the time-consuming training required to reach high precision. In this paper, we present cNeural, a customized parallel computing platform that accelerates the training of large-scale neural networks with the backpropagation algorithm. Unlike many existing parallel neural network training systems that work on thousands of training samples, cNeural is designed for fast training on large-scale datasets with millions of training samples. To achieve this goal, cNeural first adopts HBase for storing large-scale training datasets and loading them in parallel. Second, it provides a parallel in-memory computing framework for fast iterative training. Third, it uses a compact, event-driven messaging communication model instead of a heartbeat polling model for instant message delivery. Experimental results show that the overhead of data loading and messaging communication in cNeural is very low, and that cNeural is around 50 times faster than a solution based on Hadoop MapReduce. It also achieves nearly linear scalability and excellent load balancing.
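The training pattern the abstract describes is data-parallel batch backpropagation: each worker computes partial gradients over its in-memory partition of the training samples, and a coordinator aggregates them and applies one weight update per iteration. The following is a minimal, self-contained Java sketch of that pattern under stated assumptions; it is not cNeural's actual code, and the tiny network shape, class and method names, and the thread pool standing in for cluster workers are all illustrative.

import java.util.*;
import java.util.concurrent.*;

public class ParallelBpSketch {
    static final int IN = 4, HID = 8, OUT = 1;   // tiny illustrative network
    static double[][] w1 = new double[HID][IN];  // input -> hidden weights
    static double[][] w2 = new double[OUT][HID]; // hidden -> output weights

    static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

    // Partial gradients for one partition of samples (batch backpropagation).
    static double[][][] partitionGradients(double[][] xs, double[][] ys, int from, int to) {
        double[][] g1 = new double[HID][IN], g2 = new double[OUT][HID];
        for (int s = from; s < to; s++) {
            double[] h = new double[HID], o = new double[OUT];
            for (int j = 0; j < HID; j++) {      // forward pass: hidden layer
                double z = 0;
                for (int i = 0; i < IN; i++) z += w1[j][i] * xs[s][i];
                h[j] = sigmoid(z);
            }
            for (int k = 0; k < OUT; k++) {      // forward pass: output layer
                double z = 0;
                for (int j = 0; j < HID; j++) z += w2[k][j] * h[j];
                o[k] = sigmoid(z);
            }
            double[] dOut = new double[OUT];
            for (int k = 0; k < OUT; k++) {      // backward pass: output deltas
                dOut[k] = (o[k] - ys[s][k]) * o[k] * (1 - o[k]);
                for (int j = 0; j < HID; j++) g2[k][j] += dOut[k] * h[j];
            }
            for (int j = 0; j < HID; j++) {      // backward pass: hidden deltas
                double e = 0;
                for (int k = 0; k < OUT; k++) e += dOut[k] * w2[k][j];
                double dHid = e * h[j] * (1 - h[j]);
                for (int i = 0; i < IN; i++) g1[j][i] += dHid * xs[s][i];
            }
        }
        return new double[][][]{g1, g2};
    }

    public static void main(String[] args) throws Exception {
        int n = 1000, workers = 4;
        double[][] xs = new double[n][IN], ys = new double[n][OUT];
        Random rnd = new Random(42);             // synthetic stand-in data
        for (int s = 0; s < n; s++) {
            for (int i = 0; i < IN; i++) xs[s][i] = rnd.nextDouble();
            ys[s][0] = xs[s][0] > 0.5 ? 1.0 : 0.0;
        }
        for (double[] row : w1) for (int i = 0; i < IN; i++) row[i] = rnd.nextDouble() - 0.5;
        for (double[] row : w2) for (int j = 0; j < HID; j++) row[j] = rnd.nextDouble() - 0.5;

        ExecutorService pool = Executors.newFixedThreadPool(workers);
        double lr = 0.1;
        for (int iter = 0; iter < 100; iter++) {
            List<Future<double[][][]>> futures = new ArrayList<>();
            int chunk = (n + workers - 1) / workers;
            for (int p = 0; p < workers; p++) {  // fan out one task per partition
                int from = p * chunk, to = Math.min(n, from + chunk);
                futures.add(pool.submit(() -> partitionGradients(xs, ys, from, to)));
            }
            double[][] g1 = new double[HID][IN], g2 = new double[OUT][HID];
            for (Future<double[][][]> f : futures) {  // aggregate partial gradients
                double[][][] g = f.get();
                for (int j = 0; j < HID; j++) for (int i = 0; i < IN; i++) g1[j][i] += g[0][j][i];
                for (int k = 0; k < OUT; k++) for (int j = 0; j < HID; j++) g2[k][j] += g[1][k][j];
            }
            // one synchronized weight update per iteration
            for (int j = 0; j < HID; j++) for (int i = 0; i < IN; i++) w1[j][i] -= lr * g1[j][i] / n;
            for (int k = 0; k < OUT; k++) for (int j = 0; j < HID; j++) w2[k][j] -= lr * g2[k][j] / n;
        }
        pool.shutdown();
    }
}

In the system the abstract describes, the partitions would be loaded in parallel from HBase rather than generated in memory, and the gradient exchange would travel over the event-driven messaging layer between cluster nodes rather than through futures in a single JVM.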
Keywords
backpropagation; iterative methods; learning (artificial intelligence); neural nets; parallel processing; ANNs; HBase; Hadoop MapReduce; artificial neural networks; backpropagation algorithm; cNeural; customized parallel computing platform; data mining; event-driven messaging communication model; fast iterative training; heartbeat polling model; instant messaging delivery; large scale neural network training; large scale training dataset storage; load balancing; parallel in-memory computing framework; parallel neural network training systems; pattern recognition; Biological neural networks; Loading; Neurons; Parallel processing; Training; Training data; big data; distributed storage; fast training; neural network; parallel computing
fLanguage
English
Publisher
ieee
Conference_Titel
2013 IEEE International Conference on Big Data
Conference_Location
Silicon Valley, CA
Type
conf
DOI
10.1109/BigData.2013.6691598
Filename
6691598