DocumentCode :
1667141
Title :
Extracting deep bottleneck features using stacked auto-encoders
Author :
Gehring, Jonas ; Miao, Yajie ; Metze, Florian ; Waibel, Alex
Author_Institution :
Interactive Syst. Lab., Karlsruhe Inst. of Technol., Karlsruhe, Germany
fYear :
2013
Firstpage :
3377
Lastpage :
3381
Abstract :
In this work, a novel training scheme for generating bottleneck features from deep neural networks is proposed. A stack of denoising auto-encoders is first trained in a layer-wise, unsupervised manner. Afterwards, the bottleneck layer and an additional layer are added and the whole network is fine-tuned to predict target phoneme states. We perform experiments on a Cantonese conversational telephone speech corpus and find that increasing the number of auto-encoders in the network produces more useful features, but requires pre-training, especially when little training data is available. Using more unlabeled data for pre-training only yields additional gains. Evaluations on larger datasets and on different system setups demonstrate the general applicability of our approach. In terms of word error rate, relative improvements of 9.2% (Cantonese, ML training), 9.3% (Tagalog, BMMI-SAT training), 12% (Tagalog, confusion network combinations with MFCCs), and 8.7% (Switchboard) are achieved.
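The abstract describes the training scheme only at a high level. The sketch below illustrates the idea in PyTorch (an assumption; the paper does not specify a toolkit): greedy layer-wise pre-training of denoising auto-encoders, then appending a bottleneck layer and one additional hidden layer and fine-tuning the whole stack to predict phoneme states. Layer sizes, the masking-noise level, the number of target states, and the optimizer settings are illustrative assumptions, not the paper's configuration.

# Minimal sketch (not the authors' implementation) of the training scheme in the
# abstract: unsupervised layer-wise pre-training of denoising auto-encoders, then
# adding a bottleneck layer plus one more hidden layer and fine-tuning the whole
# network to predict phoneme states. All sizes and hyper-parameters are assumptions.
import torch
import torch.nn as nn


def pretrain_denoising_layer(encoder, batches, noise=0.2, epochs=5, lr=1e-3):
    """Pre-train one auto-encoder layer: corrupt the input with masking noise,
    encode, decode, and minimise the reconstruction error (no labels needed)."""
    decoder = nn.Linear(encoder.out_features, encoder.in_features)
    opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=lr)
    for _ in range(epochs):
        for x in batches:                                  # x: (batch, in_features)
            corrupted = x * (torch.rand_like(x) > noise).float()
            recon = decoder(torch.sigmoid(encoder(corrupted)))
            loss = nn.functional.mse_loss(recon, x)
            opt.zero_grad()
            loss.backward()
            opt.step()


def build_network(input_dim=360, hidden=1024, n_encoders=3, bottleneck=42, n_states=1000):
    """Stack the pre-trained encoders, then append the bottleneck layer, one
    additional hidden layer, and an output layer over phoneme states."""
    encoders = [nn.Linear(input_dim if i == 0 else hidden, hidden) for i in range(n_encoders)]
    # Greedy pre-training: each encoder is trained on the (clean) activations of
    # the previous one; shown here only as a commented call on raw feature batches.
    # pretrain_denoising_layer(encoders[0], feature_batches)
    layers = []
    for enc in encoders:
        layers += [enc, nn.Sigmoid()]
    layers += [nn.Linear(hidden, bottleneck), nn.Sigmoid()]   # bottleneck (feature layer)
    layers += [nn.Linear(bottleneck, hidden), nn.Sigmoid()]   # additional hidden layer
    layers += [nn.Linear(hidden, n_states)]                   # phoneme-state classifier
    return nn.Sequential(*layers)


# Supervised fine-tuning of the whole stack against phoneme-state labels; at run
# time the bottleneck-layer activations are extracted as features for the recogniser.
net = build_network()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.01, momentum=0.9)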
Keywords :
neural nets; speech processing; BMMI-SAT training; Cantonese conversational telephone speech corpus; MFCC; Switchboard; deep bottleneck feature; deep neural network; phoneme states; stacked autoencoder; training scheme; word error rate; Acoustics; Feature extraction; Hidden Markov models; Neural networks; Speech; Training; Vectors; Auto-encoders; Bottleneck features; Deep learning;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2013.6638284
Filename :
6638284