• DocumentCode
    2200267
  • Title

    Dynamic Bayesian network based speech recognition with pitch and energy as auxiliary variables

  • Author

    Stephenson, Todd A. ; Escofet, Jaume ; Magimai-Doss, Mathew ; Bourlard, Hervé

  • Author_Institution
    Dalle Molle Inst. for Perceptual Artificial Intelligence, Martigny, Switzerland
  • fYear
    2002
  • fDate
    2002
  • Firstpage
    637
  • Lastpage
    646
  • Abstract
    Pitch and energy are two fundamental features describing speech, and both are important in human speech recognition. However, when incorporated as features in automatic speech recognition (ASR), they usually cause a significant degradation in recognition performance because of the noise inherent in estimating or modeling them. We show experimentally how this can be corrected either by conditioning the emission distributions upon these features or by marginalizing out these features in recognition. Because neither approach is straightforward with standard hidden Markov models (HMMs), this work has been performed in the framework of dynamic Bayesian networks (DBNs), which give more flexibility in defining the topology of the emission distributions and in specifying whether variables should be marginalized out.
  • Keywords
    belief networks; feature extraction; learning (artificial intelligence); parameter estimation; random noise; speech recognition; HMM; acoustic feature estimation; automatic speech recognition; dynamic Bayesian networks; emission distributions; energy; hidden Markov models; pitch; training data; Acoustic emission; Artificial intelligence; Automatic speech recognition; Bayesian methods; Degradation; Hidden Markov models; Humans; Network topology; Speech enhancement; Speech recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks for Signal Processing, 2002. Proceedings of the 2002 12th IEEE Workshop on
  • Print_ISBN
    0-7803-7616-1
  • Type

    conf

  • DOI
    10.1109/NNSP.2002.1030075
  • Filename
    1030075