• DocumentCode
    310462
  • Title
    Advanced training methods and new network topologies for hybrid MMI-connectionist/HMM speech recognition systems
  • Author
    Neukirchen, Christoph ; Rigoll, Gerhard
  • Author_Institution
    Dept. of Comput. Sci., Gerhard-Mercator-Univ., Duisburg, Germany
  • Volume
    4
  • fYear
    1997
  • fDate
    21-24 Apr 1997
  • Firstpage
    3257
  • Abstract
    This paper deals with the construction and optimization of a hybrid speech recognition system that combines a neural vector quantizer (VQ) with discrete HMMs. In our investigations, VQ-based classification is integrated into the continuous classifier framework, and constraints are derived that must hold for the PDFs in the discrete pattern classifier context. Furthermore, it is shown that for ML training of the whole system the VQ parameters must be estimated according to the maximum mutual information (MMI) criterion. A novel training method based on gradient search is derived for neural networks that serve as an optimal VQ. This allows faster training of arbitrary network topologies than traditional MMI-NN training. Integrating multilayer MMI-NNs as the VQ in the hybrid discrete-HMM-based speech recognizer leads to a large improvement over other supervised and unsupervised single-layer VQ systems. For the speaker-independent Resource Management database, the constructed hybrid MMI-connectionist/HMM system achieves recognition rates comparable to those of traditional sophisticated continuous-PDF HMM systems.
  • Keywords
    hidden Markov models; learning (artificial intelligence); maximum likelihood estimation; network topology; neural nets; pattern classification; search problems; speech processing; speech recognition; vector quantisation; HMM speech recognition systems; ML training; PDF; VQ; continuous classifier; discrete HMM; discrete pattern classifier; gradient search; hybrid MMI-connectionist/HMM system; maximum mutual information; multilayer MMI-NN; network topologies; neural vector quantizer; parameter estimation; recognition rates; speaker independent Resource Management database; training methods; Hidden Markov models; Management training; Maximum likelihood estimation; Multi-layer neural network; Mutual information; Network topology; Neural networks; Parameter estimation; Resource management; Speech recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
  • Conference_Location
    Munich
  • ISSN
    1520-6149
  • Print_ISBN
    0-8186-7919-0
  • Type
    conf
  • DOI
    10.1109/ICASSP.1997.595488
  • Filename
    595488