  • DocumentCode
    672398
  • Title
    Semi-supervised bootstrapping approach for neural network feature extractor training
  • Author
    Grezl, Frantisek; Karafiat, Martin
  • Author_Institution
    Speech@FIT & IT4I Center of Excellence, Brno Univ. of Technol., Brno, Czech Republic
  • fYear
    2013
  • fDate
    8-12 Dec. 2013
  • Firstpage
    470
  • Lastpage
    475
  • Abstract
    This paper presents a bootstrapping approach for neural network training. The neural networks serve as a bottle-neck feature extractor for a subsequent GMM-HMM recognizer. The recognizer is also used to transcribe untranscribed data and to assign confidence scores to it. Based on the confidence, segments are selected and mixed with supervised data, and new NNs are trained. With this approach, it is possible to recover 40-55% of the performance difference between training on partially and on fully transcribed data (3 to 5% absolute improvement over an NN trained on supervised data only). Using the 70-85% of automatically transcribed segments with the highest confidence was found to be optimal for achieving this result.
  • Keywords
    feature extraction; hidden Markov models; learning (artificial intelligence); neural nets; statistical analysis; GMM-HMM recognizer; automatically transcribed segments; bottle-neck feature extractor; confidence assignment; neural network feature extractor training; neural network training; semisupervised bootstrapping approach; supervised data; transcription; untranscribed data; Accuracy; Artificial neural networks; Feature extraction; Hidden Markov models; Labeling; Training; Training data; Semi-supervised training; bootstrapping; bottle-neck features
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on
  • Conference_Location
    Olomouc
  • Type
    conf
  • DOI
    10.1109/ASRU.2013.6707775
  • Filename
    6707775
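
The selection step described in the abstract (rank automatically transcribed segments by confidence, keep the most confident share, and mix them with the supervised data) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the names `Segment`, `select_confident`, `build_training_set`, and `keep_fraction` are hypothetical, and the record does not say how confidence is computed or how the NNs are retrained. The default of 0.8 is simply a value inside the 70-85% range the abstract reports.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Segment:
    """Hypothetical container for one speech segment and its transcript."""
    segment_id: str
    transcript: str
    confidence: float  # assumed: higher means the recognizer is more certain


def select_confident(segments: Sequence[Segment], keep_fraction: float) -> List[Segment]:
    """Return the `keep_fraction` share of segments with the highest confidence."""
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must lie in [0, 1]")
    # Rank from most to least confident, then cut the list at the chosen share.
    ranked = sorted(segments, key=lambda s: s.confidence, reverse=True)
    n_keep = round(keep_fraction * len(ranked))
    return ranked[:n_keep]


def build_training_set(
    supervised: Sequence[Segment],
    auto_transcribed: Sequence[Segment],
    keep_fraction: float = 0.8,  # assumption: a value within the reported 70-85% range
) -> List[Segment]:
    """Mix manually transcribed data with the most confident automatic transcripts."""
    return list(supervised) + select_confident(auto_transcribed, keep_fraction)


if __name__ == "__main__":
    manual = [Segment("m1", "hello world", 1.0)]
    automatic = [
        Segment("a1", "good morning", 0.90),
        Segment("a2", "uh the", 0.20),
        Segment("a3", "thank you", 0.75),
        Segment("a4", "see you later", 0.60),
        Segment("a5", "yes please", 0.95),
    ]
    for seg in build_training_set(manual, automatic, keep_fraction=0.8):
        print(seg.segment_id, seg.confidence)
```

In a bootstrapping loop, the set returned by `build_training_set` would be used to train a new feature extractor, after which the untranscribed data could be recognized and scored again. Those training and recognition steps are outside the scope of this sketch.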