Title :
A parallel computing platform for training large scale neural networks
Author :
Rong Gu ; Furao Shen ; Yihua Huang
Author_Institution :
Nat. Key Lab. for Novel Software Technol., Nanjing Univ., Nanjing, China
Abstract :
Artificial neural networks (ANNs) have been applied successfully in a variety of pattern recognition and data mining applications. However, training ANNs on large scale datasets is both data-intensive and computation-intensive, so large scale ANNs are often used with reservation because of the time-consuming training required to reach high precision. In this paper, we present cNeural, a customized parallel computing platform that accelerates the training of large scale neural networks with the backpropagation algorithm. Unlike many existing parallel neural network training systems that work on thousands of training samples, cNeural is designed for fast training on large scale datasets with millions of training samples. To achieve this goal, cNeural first adopts HBase for storing and loading large scale training datasets in parallel. Second, it provides a parallel in-memory computing framework for fast iterative training. Third, it uses a compact, event-driven messaging model instead of heartbeat polling so that messages are delivered instantly. Experimental results show that the overhead of data loading and messaging communication in cNeural is very low, and that cNeural is around 50 times faster than a solution based on Hadoop MapReduce. It also achieves nearly linear scalability and excellent load balancing.
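Illustrative_Code :
The abstract describes cNeural's design only at a high level, and the record contains no code. As a rough illustration of the data-parallel, synchronously iterated backpropagation pattern it outlines, the following is a minimal single-machine Java sketch: worker threads each compute gradients over a partition of an in-memory training set, and the driver aggregates the partial gradients into one weight update per iteration. The class, hyperparameters, and toy XOR dataset are assumptions for exposition only and do not reflect cNeural's actual distributed implementation, HBase storage layer, or messaging model.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Data-parallel batch backpropagation for a tiny 2-4-1 sigmoid network.
// Each worker thread computes gradients over its partition of the
// in-memory training set; the main thread sums the partial gradients
// and applies one synchronous weight update per iteration.
public class ParallelBackpropSketch {
    static final int IN = 2, HID = 4, WORKERS = 4, EPOCHS = 10000;
    static final double LR = 0.5;
    static double[][] w1 = new double[HID][IN + 1]; // hidden weights + bias
    static double[] w2 = new double[HID + 1];       // output weights + bias

    static double sigmoid(double z) { return 1.0 / (1.0 + Math.exp(-z)); }

    // Accumulated gradient contribution of one data partition.
    static class Grad {
        double[][] g1 = new double[HID][IN + 1];
        double[] g2 = new double[HID + 1];
        void add(Grad o) {
            for (int j = 0; j < HID; j++)
                for (int i = 0; i <= IN; i++) g1[j][i] += o.g1[j][i];
            for (int j = 0; j <= HID; j++) g2[j] += o.g2[j];
        }
    }

    // Forward and backward pass over samples [from, to) of the dataset.
    static Grad gradients(double[][] x, double[] y, int from, int to) {
        Grad g = new Grad();
        for (int n = from; n < to; n++) {
            double[] h = new double[HID];
            for (int j = 0; j < HID; j++) {
                double z = w1[j][IN]; // bias term
                for (int i = 0; i < IN; i++) z += w1[j][i] * x[n][i];
                h[j] = sigmoid(z);
            }
            double z2 = w2[HID];
            for (int j = 0; j < HID; j++) z2 += w2[j] * h[j];
            double out = sigmoid(z2);
            double dOut = (out - y[n]) * out * (1 - out); // squared-error delta
            for (int j = 0; j < HID; j++) {
                g.g2[j] += dOut * h[j];
                double dHid = dOut * w2[j] * h[j] * (1 - h[j]);
                for (int i = 0; i < IN; i++) g.g1[j][i] += dHid * x[n][i];
                g.g1[j][IN] += dHid;
            }
            g.g2[HID] += dOut;
        }
        return g;
    }

    static double predict(double[] xi) {
        double z2 = w2[HID];
        for (int j = 0; j < HID; j++) {
            double z = w1[j][IN];
            for (int i = 0; i < IN; i++) z += w1[j][i] * xi[i];
            z2 += w2[j] * sigmoid(z);
        }
        return sigmoid(z2);
    }

    public static void main(String[] args) throws Exception {
        // XOR stands in for the training set; the real target is millions of rows.
        double[][] x = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
        double[] y = {0, 1, 1, 0};
        Random rnd = new Random(42);
        for (double[] row : w1)
            for (int i = 0; i < row.length; i++) row[i] = rnd.nextGaussian() * 0.5;
        for (int j = 0; j < w2.length; j++) w2[j] = rnd.nextGaussian() * 0.5;

        ExecutorService pool = Executors.newFixedThreadPool(WORKERS);
        int chunk = (x.length + WORKERS - 1) / WORKERS;
        for (int epoch = 0; epoch < EPOCHS; epoch++) {
            List<Future<Grad>> parts = new ArrayList<>();
            for (int w = 0; w < WORKERS; w++) {
                int from = w * chunk, to = Math.min(x.length, from + chunk);
                if (from < to) parts.add(pool.submit(() -> gradients(x, y, from, to)));
            }
            Grad total = new Grad();
            for (Future<Grad> f : parts) total.add(f.get()); // per-iteration barrier
            for (int j = 0; j < HID; j++)
                for (int i = 0; i <= IN; i++) w1[j][i] -= LR * total.g1[j][i];
            for (int j = 0; j <= HID; j++) w2[j] -= LR * total.g2[j];
        }
        pool.shutdown();
        for (double[] xi : x)
            System.out.printf("%s -> %.3f%n", Arrays.toString(xi), predict(xi));
    }
}

In this sketch the f.get() loop forms the per-iteration synchronization barrier; in a distributed setting such as the one the abstract describes, that role is played by message exchange between nodes, which is why the choice of an event-driven messaging model over heartbeat polling matters for iteration latency.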
Keywords :
backpropagation; iterative methods; learning (artificial intelligence); neural nets; parallel processing; ANNs; HBase; Hadoop MapReduce; artificial neural networks; backpropagation algorithm; cNeural; customized parallel computing platform; data mining; event-driven messaging communication model; fast iterative training; heartbeat polling model; instant messaging delivery; large scale neural network training; large scale training dataset storage; load balancing; parallel in-memory computing framework; parallel neural network training systems; pattern recognition; Biological neural networks; Loading; Neurons; Parallel processing; Training; Training data; big data; distributed storage; fast training; neural network; parallel computing;
Conference_Title :
2013 IEEE International Conference on Big Data
Conference_Location :
Silicon Valley, CA, USA
DOI :
10.1109/BigData.2013.6691598