• DocumentCode
    3744906
  • Title

    Cambridge University transcription systems for the multi-genre broadcast challenge

  • Author

    P. C. Woodland;X. Liu;Y. Qian;C. Zhang;M. J. F. Gales;P. Karanasou;P. Lanchantin;L. Wang

  • Author_Institution
    Cambridge University Engineering Dept, Trumpington St., Cambridge, CB2 1PZ U.K.
  • fYear
    2015
  • Firstpage
    639
  • Lastpage
    646
  • Abstract
    We describe the development of our speech-to-text transcription systems for the 2015 Multi-Genre Broadcast (MGB) challenge. Key features of the systems are: a segmentation system based on deep neural networks (DNNs); the use of HTK 3.5 for building DNN-based hybrid and tandem acoustic models and the use of these models in a joint decoding framework; techniques for adaptation of DNN-based acoustic models, including parameterised activation function adaptation; alternative acoustic models built using Kaldi; and recurrent neural network language models (RNNLMs) and RNNLM adaptation. The same language models were used with both HTK and Kaldi acoustic models, and various combined systems were built. The final systems had the lowest error rates on the evaluation data.
  • Keywords
    "Acoustics","Training","Adaptation models","Manuals","Data models","Decoding","Silicon"
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop on
  • Type

    conf

  • DOI
    10.1109/ASRU.2015.7404856
  • Filename
    7404856