Direct product based deep belief networks for automatic speech recognition

Author

Fousek, Petr ; Rennie, Steven ; Dognin, Pierre ; Goel, Vikas

Author_Institution

IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA

fYear

2013

Firstpage

3148

Lastpage

3152

Abstract

In this paper, we present new methods for parameterizing the connections of neural networks using sums of direct products. We show that low rank parameterizations of weight matrices are a subset of this set, and explore the theoretical and practical benefits of representing weight matrices using sums of Kronecker products. ASR results on a 50 hr subset of the English Broadcast News corpus indicate that the approach is promising. In particular, we show that a factorial network with more than 150 times fewer parameters in its bottom layer than its standard unconstrained counterpart suffers minimal WER degradation, and that by using sums of Kronecker products, we can close the gap in WER performance while maintaining very significant parameter savings. In addition, direct product DBNs consistently outperform standard DBNs with the same number of parameters. These results have important implications for research on deep belief networks (DBNs). They imply that we should be able to train neural networks with thousands of neurons and minimal restrictions much more rapidly than is currently possible, and that by using sums of direct products, it will be possible to train neural networks with literally millions of neurons tractably, which is an exciting prospect.
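As a rough illustration of the idea in the abstract, the NumPy sketch below represents a weight matrix as a sum of Kronecker products and compares its parameter count with that of a dense matrix of the same shape. It is not the authors' implementation: the factor shapes, the number of terms, and the helper names (`kron_sum_matvec`, `kron_sum_dense`, `count_params`) are assumptions chosen for the example. It also checks the standard fact that a rank-1 matrix is a Kronecker product of a column vector and a row vector, which is one way to see why low rank parameterizations fall inside this family.

```python
import numpy as np


def kron_sum_matvec(factors, x):
    """Apply W = sum_k kron(A_k, B_k) to x without forming W.

    Uses the identity kron(A, B) @ x == (A @ X @ B.T).ravel(), where
    X is x reshaped row-major to (A.shape[1], B.shape[1]).
    All (A_k, B_k) pairs are assumed to share the same shapes.
    """
    n1 = factors[0][0].shape[1]
    n2 = factors[0][1].shape[1]
    X = x.reshape(n1, n2)
    Y = sum(A @ X @ B.T for A, B in factors)
    return Y.reshape(-1)


def kron_sum_dense(factors):
    """Materialize the full weight matrix (for checking only)."""
    return sum(np.kron(A, B) for A, B in factors)


def count_params(factors):
    """Number of free parameters in the factored representation."""
    return sum(A.size + B.size for A, B in factors)


# Hypothetical sizes: each term is kron of an (8 x 6) and a (4 x 5) factor,
# giving a (32 x 30) weight matrix; two terms are summed.
rng = np.random.default_rng(0)
m1, n1, m2, n2, terms = 8, 6, 4, 5, 2
factors = [
    (rng.standard_normal((m1, n1)), rng.standard_normal((m2, n2)))
    for _ in range(terms)
]
x = rng.standard_normal(n1 * n2)

W = kron_sum_dense(factors)
assert np.allclose(W @ x, kron_sum_matvec(factors, x))

dense_params = W.size                    # 32 * 30 = 960
factored_params = count_params(factors)  # 2 * (48 + 20) = 136

# A rank-1 matrix u v^T equals kron(u as a column, v as a row), so a
# rank-r matrix is a sum of r such Kronecker products.
u = rng.standard_normal(7)
v = rng.standard_normal(3)
assert np.allclose(np.kron(u[:, None], v[None, :]), np.outer(u, v))
```

In this toy configuration the factored form stores 136 numbers where the dense matrix stores 960. The savings reported in the abstract depend on the layer sizes and factor shapes used in the paper, which this record does not give.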

Keywords

belief networks; matrix decomposition; neural nets; speech recognition; Kronecker product; WER degradation; automatic speech recognition; direct product DBN; direct product based deep belief network; factorial network; low rank parameterization; neural network; weight matrices; Acoustics; Biological neural networks; Bismuth; Matrix decomposition; Neurons; Training; Vectors; Back-Propagation; Deep Belief Networks; Kronecker Product; Matrix Factorization; Multi-Layer Perceptron;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on

Conference_Location

Vancouver, BC

ISSN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2013.6638238

Filename

6638238