• DocumentCode
    41711
  • Title

    Model-Based Multiple Pitch Tracking Using Factorial HMMs: Model Adaptation and Inference

  • Author

    Wohlmayr, M. ; Pernkopf, Franz

  • Author_Institution
    Signal Process. & Speech Commun. Lab. (SPSC), Graz Univ. of Technol., Graz, Austria
  • Volume
    21
  • Issue
    8
  • fYear
    2013
  • fDate
    Aug. 2013
  • Firstpage
    1742
  • Lastpage
    1754
  • Abstract
    Robustness against noise and interfering audio signals is one of the challenges in speech recognition and audio analysis technology. One avenue to approach this challenge is single-channel multiple-source modeling. Factorial hidden Markov models (FHMMs) are capable of modeling acoustic scenes with multiple sources interacting over time. While these models reach good performance on specific tasks, there are still serious limitations restricting their applicability in many domains. In this paper, we generalize these models and enhance their applicability. In particular, we develop an EM-like iterative adaptation framework which is capable of adapting the model parameters to the specific situation (e.g., actual speakers, gain, acoustic channel) using only speech mixture data. Currently, source-specific data is required to learn the model. Inference in FHMMs is an essential ingredient for adaptation. We develop efficient approaches based on observation likelihood pruning. Both adaptation and efficient inference are empirically evaluated for the task of multipitch tracking using the GRID corpus.
  • Keywords
    audio signal processing; hidden Markov models; iterative methods; speech recognition; EM-like iterative adaptation framework; GRID corpus; acoustic scene modeling; audio analysis technology; audio signals; factorial HMM; factorial hidden Markov models; model adaptation; model inference; model-based multiple pitch tracking; multipitch tracking task; observation likelihood pruning; single-channel multiple-source modeling; source-specific data; speech recognition; Efficient inference; Gaussian mixture model; factorial hidden Markov model; mixture maximization; model adaptation; multipitch tracking; self-adaptation;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type

    jour

  • DOI
    10.1109/TASL.2013.2260744
  • Filename
    6510492