  • DocumentCode
    1766529
  • Title
    Acoustic Modeling With Hierarchical Reservoirs
  • Author
    Triefenbach, Fabian ; Jalalvand, Azarakhsh ; Demuynck, Kris ; Martens, Jean-Pierre
  • Author_Institution
    ELIS Multimedia Lab., Ghent Univ., Ghent, Belgium
  • Volume
    21
  • Issue
    11
  • fYear
    2013
  • fDate
    Nov. 2013
  • Firstpage
    2439
  • Lastpage
    2450
  • Abstract
    Accurate acoustic modeling is an essential requirement of a state-of-the-art continuous speech recognizer. The Acoustic Model (AM) describes the relation between the observed speech signal and the non-observable sequence of phonetic units uttered by the speaker. Nowadays, most recognizers use Hidden Markov Models (HMMs) in combination with Gaussian Mixture Models (GMMs) to model the acoustics, but neural-based architectures are on the rise again. In this work, the recently introduced Reservoir Computing (RC) paradigm is used for acoustic modeling. A reservoir is a fixed (and thus non-trained) Recurrent Neural Network (RNN) that is combined with a trained linear model. This approach combines the ability of an RNN to model the recent past of the input sequence with a simple and reliable training procedure. It is shown here that simple reservoir-based AMs achieve reasonable phone recognition and that deep hierarchical and bi-directional reservoir architectures lead to a very competitive Phone Error Rate (PER) of 23.1% on the well-known TIMIT task.
  • Keywords
    Gaussian processes; hidden Markov models; recurrent neural nets; speech recognition; GMM; Gaussian mixture model; HMM; RC paradigm; TIMIT task; acoustic modeling; bi-directional reservoir architecture; continuous speech recognizer; fixed recurrent neural network; hidden Markov model; hierarchical reservoir architecture; hierarchical reservoirs; neural-based architectures; nonobservable sequence; nontrained RNN; observed speech signal; phone error rate; phone recognition; phonetic units; reservoir computing paradigm; reservoir-based AM; trained linear model; training procedure; Acoustics; Computational modeling; Hidden Markov models; Neurons; Reservoirs; Speech; Training; Acoustic modeling; automatic speech recognition; recurrent neural networks; reservoir computing;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2013.2280209
  • Filename
    6587732
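
The abstract describes a reservoir as a fixed, untrained recurrent network combined with a trained linear model. The following is a minimal sketch of that general idea only, not the authors' system: it uses a single small reservoir (no hierarchy, no bi-directional processing, no speech features), and every name, size, and constant in it is a hypothetical choice made for illustration. In particular, the row-sum scaling of the recurrent weights and the ridge-regression readout are assumptions picked to keep the example short and self-contained.

```python
import math
import random


def make_reservoir(n_in, n_res, scale=0.9, seed=0):
    """Draw fixed random weights; these are never trained."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_res)]
    w_rec = [[rng.uniform(-1, 1) for _ in range(n_res)] for _ in range(n_res)]
    # Assumption: scale each row so its absolute weights sum to `scale` (< 1),
    # a crude way to keep the recurrent dynamics from blowing up.
    for row in w_rec:
        total = sum(abs(v) for v in row)
        for j in range(n_res):
            row[j] *= scale / total
    return w_in, w_rec


def run_reservoir(w_in, w_rec, inputs):
    """Feed a sequence of input vectors through the fixed reservoir."""
    state = [0.0] * len(w_rec)
    states = []
    for u in inputs:
        state = [
            math.tanh(
                sum(w * x for w, x in zip(w_in[i], u))
                + sum(w * s for w, s in zip(w_rec[i], state))
            )
            for i in range(len(w_rec))
        ]
        states.append(state)
    return states


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def train_readout(states, targets, ridge=1e-3):
    """Fit the linear readout (the only trained part) by ridge regression."""
    feats = [s + [1.0] for s in states]  # append a constant bias feature
    n = len(feats[0])
    a = [
        [sum(f[i] * f[j] for f in feats) + (ridge if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    b = [sum(f[i] * y for f, y in zip(feats, targets)) for i in range(n)]
    return solve(a, b)


def predict(states, weights):
    """Apply the trained linear readout to reservoir states."""
    return [sum(w * f for w, f in zip(weights, s + [1.0])) for s in states]
```

As a toy usage, one could drive the reservoir with a random scalar sequence and train the readout to output the previous input value; this only exercises the "model the recent past" property mentioned in the abstract and says nothing about phone recognition accuracy.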