• DocumentCode
    3484673
  • Title

    A novel bottleneck-BLSTM front-end for feature-level context modeling in conversational speech recognition

  • Author

    Wöllmer, Martin ; Schuller, Björn ; Rigoll, Gerhard

  • Author_Institution
    Inst. for Human-Machine Commun., Tech. Univ. München, Munich, Germany
  • fYear
    2011
  • fDate
    11-15 Dec. 2011
  • Firstpage
    36
  • Lastpage
    41
  • Abstract
    We present a novel automatic speech recognition (ASR) front-end that unites Long Short-Term Memory context modeling, bidirectional speech processing, and bottleneck (BN) networks for enhanced Tandem speech feature generation. Bidirectional Long Short-Term Memory (BLSTM) networks were shown to be well suited for phoneme recognition and probabilistic feature extraction since they efficiently incorporate a flexible amount of long-range temporal context, leading to better ASR results than conventional recurrent networks or multi-layer perceptrons. Combining BLSTM modeling and bottleneck feature generation allows us to produce feature vectors of arbitrary size, independent of the network training targets. Experiments on the COSINE and the Buckeye corpora containing spontaneous, conversational speech show that the proposed BN-BLSTM front-end leads to better ASR accuracies than previously proposed BLSTM-based Tandem and multi-stream systems.
  • Keywords
    feature extraction; multilayer perceptrons; probability; speech recognition; BLSTM front-end; Buckeye corpora; COSINE; automatic speech recognition; bidirectional long short-term memory; bidirectional speech processing; bottleneck feature generation; bottleneck networks; feature-level context modeling; network training targets; phoneme recognition; probabilistic feature extraction; recurrent networks; speech feature generation; Context; Feature extraction; Hidden Markov models; Logic gates; Mel frequency cepstral coefficient; Speech; Vectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on
  • Conference_Location
    Waikoloa, HI
  • Print_ISBN
    978-1-4673-0365-1
  • Electronic_ISBN
    978-1-4673-0366-8
  • Type

    conf

  • DOI
    10.1109/ASRU.2011.6163902
  • Filename
    6163902