• DocumentCode
    2575472
  • Title
    Factorial Scaled Hidden Markov Model for polyphonic audio representation and source separation
  • Author
    Ozerov, Alexey ; Févotte, Cédric ; Charbit, Maurice
  • fYear
    2009
  • fDate
    18-21 Oct. 2009
  • Firstpage
    121
  • Lastpage
    124
  • Abstract
    We present a new probabilistic model for polyphonic audio, termed the factorial scaled hidden Markov model (FS-HMM), which generalizes several existing models, notably the Gaussian scaled mixture model and the Itakura-Saito nonnegative matrix factorization (NMF) model. We describe two expectation-maximization (EM) algorithms for maximum likelihood estimation, which differ in the choice of complete data set. The second EM algorithm, based on a reduced complete data set and multiplicative updates inspired by NMF methodology, exhibits much faster convergence. We consider the FS-HMM in different configurations for the difficult problem of speech/music separation from a single channel and report satisfactory results.
  • Keywords
    audio signal processing; expectation-maximisation algorithm; hidden Markov models; signal representation; source separation; Gaussian scaled mixture model; Itakura-Saito nonnegative matrix factorization model; expectation-maximization algorithm; factorial scaled hidden Markov model; maximum likelihood estimation; polyphonic audio; polyphonic audio representation; speech-music separation; Acoustic signal processing; Conferences; Convergence; Hidden Markov models; Maximum likelihood estimation; Music information retrieval; Signal processing algorithms; Source separation; Speech; Telecommunications; Factorial hidden Markov model; Gaussian scaled mixture models; audio source separation; nonnegative matrix factorization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Applications of Signal Processing to Audio and Acoustics, 2009. WASPAA '09. IEEE Workshop on
  • Conference_Location
    New Paltz, NY
  • ISSN
    1931-1168
  • Print_ISBN
    978-1-4244-3678-1
  • Electronic_ISBN
    1931-1168
  • Type
    conf
  • DOI
    10.1109/ASPAA.2009.5346527
  • Filename
    5346527