DocumentCode :
179605
Title :
Lattice based optimization of bottleneck feature extractor with linear transformation
Author :
Diyuan Liu ; Si Wei ; Wu Guo ; Yebo Bao ; Shifu Xiong ; Lirong Dai
Author_Institution :
Nat. Eng. Lab. for Speech & Language Inf. Process., Univ. of Sci. & Technol. of China, Hefei, China
fYear :
2014
fDate :
4-9 May 2014
Firstpage :
5617
Lastpage :
5621
Abstract :
This paper proposes a lattice-based sequential discriminative training method to extract more discriminative bottleneck features. In our method, the bottleneck neural network is first trained with the cross-entropy criterion, and then only the weights of the bottleneck layer are retrained with a sequential criterion. If the outputs of the layer before the bottleneck are treated as raw features, the new method is equivalent to a linear feature transformation algorithm. This linearity makes the optimization much easier than updating the whole neural network. Like fMPE and RDLT, the neural network is retrained with batch-mode gradient descent, so the training is easily parallelized. Meanwhile, batch-mode optimization naturally handles the indirect gradient, making the optimization more precise. Experimental results on a Mandarin transcription task and the Switchboard task show the effectiveness of the proposed method, with the CER decreasing from 12.2% to 11.3% and the WER from 16.1% to 15.0%, respectively.
Keywords :
error statistics; feature extraction; gradient methods; lattice theory; natural language processing; neural nets; optimisation; speech recognition; CER; Mandarin transcription task; RDLT; WER; batch-mode gradient descent; batch-mode optimization; bottleneck feature extractor; bottleneck layer; bottleneck neural network; character error rate; cross-entropy criterion; fMPE; indirect gradient; lattice-based optimization; lattice-based sequential discriminative training method; linear feature transformation algorithm; minimum phone error; raw features; Switchboard task; word error rate; Acoustics; Feature extraction; Hidden Markov models; Neural networks; Speech; Speech recognition; Training; bottleneck features; discriminative training; neural networks; sequence training; speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
Type :
conf
DOI :
10.1109/ICASSP.2014.6854678
Filename :
6854678