Title :
Restricted Boltzmann Machines and Deep Belief Networks on multi-core processors
Author :
Lopes, Nelson ; Ribeiro, Bernardete ; Gonçalves, João
Author_Institution :
CISUC-Center for Inf. & Syst., Univ. of Coimbra, Coimbra, Portugal
Abstract :
Deep learning architectures, in contrast with shallow models, draw on insights from biological inspiration, a challenge pursued since the idea of simulating the brain was first conceived. In particular, their many hierarchical levels of composition motivate parallel implementations that can make training acceptably fast. When it comes to performance, Graphics Processing Units (GPUs) have carved out their own place in machine learning. In this paper, we present an approach that relies mainly on three kernels for implementing both the Restricted Boltzmann Machine (RBM) and Deep Belief Network (DBN) algorithms. Instead of considering the neuron as the smallest unit of computation, each thread represents the connection between two neurons (one visible and one hidden). Although this may seem counterintuitive at first, the rationale is to view a connection as performing a simple function that multiplies the clamped input by its weight. Thus, we maximize the GPU workload and avoid idle cores. Moreover, we placed great emphasis on designing the kernels to avoid uncoalesced memory accesses and to take advantage of shared memory in order to reduce global memory accesses. Additionally, our approach uses a step adaptive learning rate procedure that accelerates convergence. The approach yields very good speedups (up to 46×) over a straightforward implementation when both GPU and CPU versions are tested on the MNIST database.
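As a rough illustration of the thread-per-connection idea described in the abstract (a sketch written for this record, not the authors' actual kernels), the CUDA fragment below assigns one block per hidden unit and lets each thread multiply clamped visible values by the corresponding weights before a shared-memory reduction produces the unit's activation. The kernel and variable names, the 784-unit visible layer, and the block size are assumptions made here for illustration.

#include <cuda_runtime.h>
#include <math.h>

#define NUM_VISIBLE 784   // e.g. one MNIST image (assumption for illustration)
#define BLOCK_SIZE  128   // threads per block, power of two for the reduction

// One block per hidden unit j; each thread handles a subset of the
// (visible, hidden) connections of that unit, multiplying the clamped
// visible value by its weight. Partial sums are reduced in shared
// memory so global memory is read only once per connection.
__global__ void computeHiddenProbabilities(const float *v,       // clamped visible units
                                           const float *W,       // weights, row-major: W[j * NUM_VISIBLE + i]
                                           const float *bHidden, // hidden biases
                                           float *hProb)         // output: P(h_j = 1 | v)
{
    __shared__ float partial[BLOCK_SIZE];
    const int j   = blockIdx.x;      // hidden unit handled by this block
    const int tid = threadIdx.x;

    // Each thread accumulates the products of "its" connections;
    // consecutive threads read consecutive weights (coalesced access).
    float sum = 0.0f;
    for (int i = tid; i < NUM_VISIBLE; i += BLOCK_SIZE)
        sum += v[i] * W[j * NUM_VISIBLE + i];
    partial[tid] = sum;
    __syncthreads();

    // Tree reduction in shared memory to obtain the pre-activation.
    for (int s = BLOCK_SIZE / 2; s > 0; s >>= 1) {
        if (tid < s)
            partial[tid] += partial[tid + s];
        __syncthreads();
    }

    // Thread 0 applies the logistic sigmoid to bias + weighted sum.
    if (tid == 0)
        hProb[j] = 1.0f / (1.0f + expf(-(partial[0] + bHidden[j])));
}

A host launch would then use one block per hidden unit, e.g. computeHiddenProbabilities<<<numHidden, BLOCK_SIZE>>>(d_v, d_W, d_bHidden, d_hProb); the same pattern can be mirrored for the visible-layer reconstruction step of contrastive divergence.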
Keywords :
Boltzmann machines; belief networks; graphics processing units; learning (artificial intelligence); multiprocessing systems; CPU implementations; DBN; GPU workload maximization; MNIST database; RBM; biological inspiration; deep belief networks; deep learning architecture models; graphics processing units; machine learning; multicore processors; restricted Boltzmann machines; shallow models; step adaptive learning rate procedure; Computer architecture; Graphics processing unit; Instruction sets; Kernel; Neurons; Training; Vectors;
Conference_Titel :
Neural Networks (IJCNN), The 2012 International Joint Conference on
Conference_Location :
Brisbane, QLD
Print_ISBN :
978-1-4673-1488-6
Electronic_ISBN :
2161-4393
DOI :
10.1109/IJCNN.2012.6252431