• DocumentCode
    1665855
  • Title
    Direct product based deep belief networks for automatic speech recognition
  • Author
    Fousek, Petr ; Rennie, Steven ; Dognin, Pierre ; Goel, Vikas
  • Author_Institution
    IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
  • fYear
    2013
  • Firstpage
    3148
  • Lastpage
    3152
  • Abstract
    In this paper, we present new methods for parameterizing the connections of neural networks using sums of direct products. We show that low rank parameterizations of weight matrices are a subset of this set, and explore the theoretical and practical benefits of representing weight matrices using sums of Kronecker products. ASR results on a 50-hour subset of the English Broadcast News corpus indicate that the approach is promising. In particular, we show that a factorial network with more than 150 times fewer parameters in its bottom layer than its standard unconstrained counterpart suffers minimal WER degradation, and that by using sums of Kronecker products, we can close the gap in WER performance while maintaining very significant parameter savings. In addition, direct product DBNs consistently outperform standard DBNs with the same number of parameters. These results have important implications for research on deep belief networks (DBNs). They imply that we should be able to train neural networks with thousands of neurons and minimal restrictions much more rapidly than is currently possible, and that by using sums of direct products, it will be possible to train neural networks with literally millions of neurons tractably, an exciting prospect.
  • Keywords
    belief networks; matrix decomposition; neural nets; speech recognition; Kronecker product; WER degradation; automatic speech recognition; direct product DBN; direct product based deep belief network; factorial network; low rank parameterization; neural network; weight matrices; Acoustics; Biological neural networks; Bismuth; Neurons; Training; Vectors; Back-Propagation; Deep Belief Networks; Matrix Factorization; Multi-Layer Perceptron
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
  • Conference_Location
    Vancouver, BC
  • ISSN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2013.6638238
  • Filename
    6638238
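
The parameterization described in the abstract, a weight matrix written as a sum of Kronecker (direct) products, can be illustrated with a minimal sketch. The helper names (`kron`, `sum_of_kron`, `param_counts`) and the 32×32 factor shapes are assumptions chosen for illustration; they are not taken from the paper, and the sketch does not reproduce the authors' training setup or their reported savings.

```python
# Illustrative sketch only: helper names and factor shapes are assumptions,
# not values or code from the paper.

def kron(a, b):
    """Kronecker (direct) product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def mat_add(x, y):
    """Element-wise sum of two matrices of identical shape."""
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def sum_of_kron(factors):
    """Build W = sum_i kron(A_i, B_i) from a list of (A_i, B_i) pairs."""
    total = None
    for a, b in factors:
        term = kron(a, b)
        total = term if total is None else mat_add(total, term)
    return total


def param_counts(a_shape, b_shape, n_terms):
    """Return (dense, factored) parameter counts for the same overall shape."""
    (m1, n1), (m2, n2) = a_shape, b_shape
    dense = (m1 * m2) * (n1 * n2)
    factored = n_terms * (m1 * n1 + m2 * n2)
    return dense, factored


# A single Kronecker term built from two 2x2 factors gives a 4x4 matrix.
W = sum_of_kron([([[1, 2], [3, 4]], [[0, 1], [1, 0]])])

# A column vector times a row vector is a rank-1 outer product, which is
# why low rank parameterizations fall inside the sum-of-products family.
rank_one = kron([[1], [2]], [[3, 4]])

# Hypothetical sizes: a 1024x1024 layer from two terms of 32x32 factors.
dense, factored = param_counts((32, 32), (32, 32), n_terms=2)
```

With these assumed shapes, the dense layer holds 1,048,576 weights while the two-term factored form holds 4,096; adding terms trades parameter savings for expressiveness, which is the trade-off the abstract refers to.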