Title :
Optimizing deep bottleneck feature extraction
Author :
Quoc Bao Nguyen ; Gehring, Jonas ; Kilgour, Kevin ; Waibel, Alex
Author_Institution :
Int. Center for Adv. Commun. Technol. - InterACT, Karlsruhe Inst. of Technol., Karlsruhe, Germany
Abstract :
We investigate several optimizations to a recently published architecture for extracting bottleneck features for large-vocabulary speech recognition with deep neural networks. On a Tagalog conversational telephone speech corpus, we improve the relative word error rate reduction of first-pass systems over MFCC baselines from the previously reported 12% to 21%. This is achieved by using different input features, training the network to predict context-dependent targets, employing an efficient learning rate schedule, and varying several architectural details. Evaluations on two larger German and French speech transcription tasks show that the proposed optimizations are universally applicable and yield comparable gains on other corpora (19.9% and 22.8%, respectively).
Keywords :
feature extraction; learning (artificial intelligence); natural language processing; neural nets; optimisation; speech recognition; French speech transcription tasks; German speech transcription tasks; Tagalog conversational telephone speech corpus; context-dependent target prediction; deep bottleneck feature extraction optimization; deep neural networks; first-pass systems; input features; large-vocabulary speech recognition; learning rate schedule; network training; recognition performance improvement; relative word error rate reduction; Feature extraction; Hidden Markov models; Mel frequency cepstral coefficient; Neural networks; Optimization; Speech; Training
Conference_Title :
Computing and Communication Technologies, Research, Innovation, and Vision for the Future (RIVF), 2013 IEEE RIVF International Conference on
Conference_Location :
Hanoi
Print_ISBN :
978-1-4799-1349-7
DOI :
10.1109/RIVF.2013.6719885