DocumentCode :
1329440
Title :
High-Performance Reconfigurable Hardware Architecture for Restricted Boltzmann Machines
Author :
Le Ly, Daniel; Chow, P.
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Toronto, Toronto, ON, Canada
Volume :
21
Issue :
11
fYear :
2010
Firstpage :
1780
Lastpage :
1792
Abstract :
Despite the popularity and success of neural networks in research, the number of resulting commercial or industrial applications has been limited. A primary cause for this lack of adoption is that neural networks are usually implemented as software running on general-purpose processors. Hence, a hardware implementation that can exploit the inherent parallelism in neural networks is desired. This paper investigates how the restricted Boltzmann machine (RBM), which is a popular type of neural network, can be mapped to a high-performance hardware architecture on field-programmable gate array (FPGA) platforms. The proposed modular framework is designed to reduce the time complexity of the computations through heavily customized hardware engines. A method to partition large RBMs into smaller congruent components is also presented, allowing the distribution of one RBM across multiple FPGA resources. The framework is tested on a platform of four Xilinx Virtex II-Pro XC2VP70 FPGAs running at 100 MHz through a variety of different configurations. The maximum performance was obtained by instantiating an RBM of 256 × 256 nodes distributed across four FPGAs, which resulted in a computational speed of 3.13 billion connection-updates-per-second and a speedup of 145-fold over an optimized C program running on a 2.8-GHz Intel processor.
Keywords :
Boltzmann machines; C language; computational complexity; field programmable gate arrays; reconfigurable architectures; C program; Intel processor; Xilinx Virtex II-Pro XC2VP70 FPGA; field-programmable gate array platforms; general-purpose processors; high-performance reconfigurable hardware architecture; neural networks; restricted Boltzmann machines; time complexity; Artificial neural networks; Complexity theory; Computer architecture; Equations; Field programmable gate arrays; Hardware; Program processors; Boltzmann machines; computer architecture; field-programmable gate arrays; neural network hardware; parallel processing; Artificial Intelligence; Computer Simulation; Computers; Neural Networks (Computer); Programming Languages; Software Validation;
fLanguage :
English
Journal_Title :
Neural Networks, IEEE Transactions on
Publisher :
IEEE
ISSN :
1045-9227
Type :
jour
DOI :
10.1109/TNN.2010.2073481
Filename :
5580081