Title :
Semi-supervised bootstrapping approach for neural network feature extractor training
Author :
Grezl, Frantisek ; Karafiat, Martin
Author_Institution :
Speech@FIT & IT4I Center of Excellence, Brno Univ. of Technol., Brno, Czech Republic
Abstract :
This paper presents a bootstrapping approach for neural network training. The neural networks serve as a bottle-neck feature extractor for a subsequent GMM-HMM recognizer. The recognizer is also used for transcription and confidence assignment of untranscribed data. Based on the confidence, segments are selected and mixed with supervised data, and new NNs are trained. With this approach, it is possible to recover 40-55% of the performance gap between training on partially and on fully transcribed data (3 to 5% absolute improvement over an NN trained on supervised data only). Using the 70-85% of automatically transcribed segments with the highest confidence was found optimal to achieve this result.
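The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Segment` type, the function names, and the default `keep_fraction` of 0.8 (a value inside the reported 70-85% range) are assumptions made for illustration, and the confidence is assumed to be a single score per segment.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical container for one speech segment."""
    seg_id: str
    transcript: str
    confidence: float  # assumed: one recognizer confidence score per segment


def select_confident(segments, keep_fraction):
    """Return the top `keep_fraction` of segments, ranked by confidence."""
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must lie in [0, 1]")
    ranked = sorted(segments, key=lambda s: s.confidence, reverse=True)
    n_keep = round(keep_fraction * len(ranked))
    return ranked[:n_keep]


def build_training_set(supervised, auto_transcribed, keep_fraction=0.8):
    """Mix manually transcribed data with the most confident automatic segments."""
    return list(supervised) + select_confident(auto_transcribed, keep_fraction)
```

In a bootstrapping loop, the mixed set returned by `build_training_set` would be used to train new NNs, and the resulting recognizer could then re-transcribe the untranscribed data.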
Keywords :
feature extraction; hidden Markov models; learning (artificial intelligence); neural nets; statistical analysis; GMM-HMM recognizer; automatically transcribed segments; bottle-neck feature extractor; confidence assignment; neural network feature extractor training; neural network training; semisupervised bootstrapping approach; supervised data; transcription; untranscribed data; Accuracy; Artificial neural networks; Feature extraction; Hidden Markov models; Labeling; Training; Training data; Semi-supervised training; bootstrapping; bottle-neck features;
Conference_Title :
Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on
Conference_Location :
Olomouc
DOI :
10.1109/ASRU.2013.6707775