DocumentCode
3000285
Title
Optimizing deep bottleneck feature extraction
Author
Quoc Bao Nguyen ; Gehring, Jonas ; Kilgour, Kevin ; Waibel, Alex
Author_Institution
Int. Center for Adv. Commun. Technol. - InterACT, Karlsruhe Inst. of Technol., Karlsruhe, Germany
fYear
2013
fDate
10-13 Nov. 2013
Firstpage
152
Lastpage
156
Abstract
We investigate several optimizations to a recently published architecture for extracting bottleneck features for large-vocabulary speech recognition with deep neural networks. On a Tagalog conversational telephone speech corpus, we raise the relative word error rate reduction of first-pass systems over MFCC baselines from the previously reported 12% to 21%. This is achieved by using different input features, training the network to predict context-dependent targets, employing an efficient learning rate schedule, and varying several architectural details. Evaluations on two larger German and French speech transcription tasks show that the proposed optimizations are universally applicable and yield comparable gains on other corpora (19.9% and 22.8%, respectively).
Keywords
feature extraction; learning (artificial intelligence); natural language processing; neural nets; optimisation; speech recognition; French speech transcription tasks; German speech transcription tasks; Tagalog conversational telephone speech corpus; context-dependent target prediction; deep bottleneck feature extraction optimization; deep neural networks; first-pass systems; input features; large-vocabulary speech recognition; learning rate schedule; network training; recognition performance improvement; relative word error rate reduction; Feature extraction; Hidden Markov models; Mel frequency cepstral coefficient; Neural networks; Optimization; Speech; Training
fLanguage
English
Publisher
IEEE
Conference_Titel
Computing and Communication Technologies, Research, Innovation, and Vision for the Future (RIVF), 2013 IEEE RIVF International Conference on
Conference_Location
Hanoi
Print_ISBN
978-1-4799-1349-7
Type
conf
DOI
10.1109/RIVF.2013.6719885
Filename
6719885