Title :
Small-footprint high-performance deep neural network-based speech recognition using split-VQ
Author :
Yongqiang Wang ; Jinyu Li ; Yifan Gong
Author_Institution :
Microsoft Corp., Redmond, WA, USA
Abstract :
Due to the large number of parameters in deep neural networks (DNNs), it is challenging to design a small-footprint DNN-based speech recognition system while maintaining high recognition performance. Even with a singular value decomposition (SVD) method and scalar quantization, the DNN model is still too large to be deployed on many mobile devices, and common practices such as reducing the number of hidden nodes often cause significant accuracy loss. In this work, we propose to split each row vector of the weight matrices into sub-vectors and quantize them into a set of codewords using a split vector quantization (split-VQ) algorithm. The codebook can be fine-tuned using back-propagation when aggressive quantization is performed. Experimental results demonstrate that the proposed method can further reduce the model size by 75% to 80% and save 10% to 50% of the computation on top of an already very compact SVD-DNN without noticeable performance degradation. This results in a 3.2 MB DNN that achieves recognition performance similar to that of a 59.1 MB standard DNN.
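The following is a minimal sketch of the split-VQ idea described in the abstract, not the authors' implementation: each row of a weight matrix is split into fixed-length sub-vectors, a shared codebook is learned over all sub-vectors (here with k-means), and each sub-vector is replaced by the index of its nearest codeword. The sub-vector length, codebook size, and the use of scikit-learn's KMeans are illustrative assumptions; the paper additionally fine-tunes the codebook with back-propagation, which this sketch omits.

```python
# Illustrative split-VQ sketch (assumed parameters), not the paper's code.
import numpy as np
from sklearn.cluster import KMeans


def split_vq(W, sub_dim=4, num_codewords=256, seed=0):
    """Split each row of W into length-sub_dim sub-vectors and quantize them."""
    rows, cols = W.shape
    assert cols % sub_dim == 0, "columns must be divisible by sub_dim"
    # All sub-vectors from all rows: shape (rows * cols / sub_dim, sub_dim)
    sub_vectors = W.reshape(-1, sub_dim)
    # Learn one shared codebook over the sub-vectors
    kmeans = KMeans(n_clusters=num_codewords, n_init=4, random_state=seed).fit(sub_vectors)
    codebook = kmeans.cluster_centers_           # (num_codewords, sub_dim) floats
    indices = kmeans.labels_.astype(np.uint8)    # one byte per sub-vector for <=256 codewords
    return codebook, indices.reshape(rows, cols // sub_dim)


def reconstruct(codebook, indices):
    """Rebuild an approximate weight matrix from the codebook and index table."""
    rows, n_sub = indices.shape
    sub_dim = codebook.shape[1]
    return codebook[indices].reshape(rows, n_sub * sub_dim)


if __name__ == "__main__":
    W = np.random.randn(512, 1024).astype(np.float32)   # toy hidden-layer weights
    codebook, idx = split_vq(W, sub_dim=4, num_codewords=256)
    W_hat = reconstruct(codebook, idx)
    # Storage drops from 32 bits per weight to roughly 2 bits per weight
    # (an 8-bit index covering 4 weights, plus the small codebook).
    print("reconstruction RMSE:", np.sqrt(np.mean((W - W_hat) ** 2)))
```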
Keywords :
neural nets; quantisation (signal); singular value decomposition; speech recognition; codewords; scalar quantization; singular value matrix decomposition method; small-footprint high-performance deep neural network; speech recognition; split vector quantization algorithm; split-VQ; Accuracy; Acoustics; Hidden Markov models; Matrix decomposition; Neural networks; Quantization (signal); Speech recognition; DNN; model compression; on device speech recognition; split-VQ;
Conference_Titel :
2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
South Brisbane, QLD, Australia
DOI :
10.1109/ICASSP.2015.7178919