A 240 G-ops/s Mobile Coprocessor for Deep Neural Networks

Author

Gokhale, Vinayak; Jin, Jonghoon; Dundar, Aysegul; Martini, Ben; Culurciello, Eugenio

Author_Institution

Electr. & Comput. Eng., Purdue Univ., West Lafayette, IN, USA

fYear

2014

fDate

23-28 June 2014

Firstpage

696

Lastpage

701

Abstract

Deep networks are state-of-the-art models for understanding the content of images, video, audio and raw input data. Current computing systems cannot run deep network models in real time at low power. In this paper we present nn-X: a scalable, low-power coprocessor for real-time execution of deep neural networks. nn-X is implemented on programmable logic devices and comprises an array of configurable processing elements called collections. These collections perform the most common operations in deep networks: convolution, subsampling and non-linear functions. The nn-X system includes four high-speed direct memory access interfaces to DDR3 memory and two ARM Cortex-A9 processors. Each port is capable of a sustained throughput of 950 MB/s in full duplex. nn-X achieves a peak performance of 227 G-ops/s and a measured performance in deep learning applications of up to 200 G-ops/s, while consuming less than 4 watts of power. This translates to a performance-per-power improvement of 10 to 100 times over conventional mobile and desktop processors.
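
The figures quoted in the abstract imply a few derived quantities that can be checked with simple arithmetic. The sketch below is illustrative only: the variable names are hypothetical, the 4 W figure is treated as an upper bound (so the efficiency values are lower bounds), and the aggregate bandwidth assumes that the "ports" mentioned are the four DMA interfaces, which the abstract suggests but does not state outright.

```python
# Figures quoted in the abstract (names here are illustrative, not from the paper).
PEAK_GOPS = 227.0        # peak performance, G-ops/s
MEASURED_GOPS = 200.0    # best measured performance, G-ops/s
POWER_W_UPPER = 4.0      # "less than 4 watts": treated as an upper bound
DMA_PORTS = 4            # assumption: each DMA interface is one "port"
PORT_MB_PER_S = 950.0    # sustained throughput per port, per direction


def gops_per_watt(gops: float, watts: float) -> float:
    """Return performance per unit power in G-ops/s per watt."""
    return gops / watts


# Because the true power draw is below 4 W, these are lower bounds.
measured_eff = gops_per_watt(MEASURED_GOPS, POWER_W_UPPER)
peak_eff = gops_per_watt(PEAK_GOPS, POWER_W_UPPER)

# Fraction of peak throughput reached in the best measured application.
utilization = MEASURED_GOPS / PEAK_GOPS

# Aggregate sustained bandwidth in one direction across all ports.
aggregate_mb_per_s = DMA_PORTS * PORT_MB_PER_S

print(f"measured efficiency >= {measured_eff:.2f} G-ops/s/W")
print(f"peak efficiency     >= {peak_eff:.2f} G-ops/s/W")
print(f"utilization of peak  = {utilization:.1%}")
print(f"aggregate bandwidth  = {aggregate_mb_per_s:.0f} MB/s per direction")
```

Under these assumptions the measured efficiency is at least 50 G-ops/s per watt, and the best measured workload reaches roughly 88% of peak throughput.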

Keywords

coprocessors; neural nets; programmable logic devices; ARM Cortex-A9 processors; DDR3 memory; configurable processing elements; convolution operation; deep neural networks; desktop processors; memory access interface; mobile coprocessor; mobile processors; nn-X coprocessor; nonlinear function operation; power consumption; subsampling operation; Artificial neural networks; Convolution; Coprocessors; Memory management; Performance evaluation; Program processors; Computer vision; convolutional neural networks; embedded vision system; hardware acceleration; machine learning

fLanguage

English

Publisher

IEEE

Conference_Titel

Computer Vision and Pattern Recognition Workshops (CVPRW), 2014 IEEE Conference on

Conference_Location

Columbus, OH

Type

conf

DOI

10.1109/CVPRW.2014.106

Filename

6910056