DocumentCode
180346
Title
X1000 real-time phoneme recognition VLSI using feed-forward deep neural networks
Author
Jonghong Kim ; Kyuyeon Hwang ; Wonyong Sung
Author_Institution
Dept. of Electr. & Comput. Eng., Seoul Nat. Univ., Seoul, South Korea
fYear
2014
fDate
4-9 May 2014
Firstpage
7510
Lastpage
7514
Abstract
Deep neural networks perform very well in phoneme and speech recognition compared with the previously used GMM (Gaussian Mixture Model)-based approaches. However, efficient implementation of deep neural networks is difficult because the network must be very large when high recognition accuracy is demanded. In this work, we develop a digital VLSI for phoneme recognition using deep neural networks and assess the design in terms of throughput, chip size, and power consumption. The developed VLSI employs a fixed-point optimization method that uses only +Δ, 0, and -Δ to represent each weight. The design employs 1,024 simple processing units in each layer, a number that can be scaled easily according to the required throughput; the throughput of the architecture varies from 62.5 to 1,000 times the real-time processing speed.
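The three-level weight representation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how Δ is chosen or how a trained weight is mapped to a level, so the nearest-level rule (threshold at Δ/2) and the function names `ternarize` and `ternary_dot` below are assumptions made purely for illustration, not the authors' method.

```python
def ternarize(weights, delta, threshold=None):
    """Map each weight to -delta, 0, or +delta.

    Assumption: a weight is rounded to the nearest of the three levels,
    i.e. the decision threshold defaults to delta / 2.
    """
    if threshold is None:
        threshold = delta / 2.0
    out = []
    for w in weights:
        if w > threshold:
            out.append(delta)
        elif w < -threshold:
            out.append(-delta)
        else:
            out.append(0.0)
    return out


def ternary_dot(inputs, ternary_weights, delta):
    """Dot product with ternary weights.

    Each term is an add, a subtract, or a skip; the single multiplication
    by delta is applied once to the accumulated sum.
    """
    acc = 0.0
    for x, w in zip(inputs, ternary_weights):
        if w > 0:
            acc += x
        elif w < 0:
            acc -= x
    return acc * delta


if __name__ == "__main__":
    q = ternarize([0.9, -0.1, -0.7, 0.2], delta=0.5)
    print(q)
    print(ternary_dot([1.0, 2.0, 3.0, 4.0], q, delta=0.5))
```

With weights restricted to these three levels, the per-weight multiplication reduces to an addition, a subtraction, or nothing, which is one plausible reason the processing units can be kept simple.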
Keywords
Gaussian processes; VLSI; feedforward neural nets; mixture models; optimisation; power consumption; speech recognition; GMM; chip size; digital VLSI; feed-forward deep neural networks; fixed-point optimization method; gaussian mixture model; high recognition accuracy; power consumption; real-time phoneme recognition VLSI; real-time processing speed; speech recognition applications; Clocks; Computer architecture; Neural networks; Real-time systems; Registers; Throughput; Very large scale integration; Deep neural network; VLSI; fixed-point optimization; phoneme recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location
Florence
Type
conf
DOI
10.1109/ICASSP.2014.6855060
Filename
6855060