• DocumentCode
    155685
  • Title

    Joint audio source separation and dereverberation based on multichannel factorial hidden Markov model

  • Author

    Higuchi, Tatsuro ; Kameoka, Hirokazu

  • Author_Institution
    Grad. Sch. of Inf. Sci. & Technol., Univ. of Tokyo, Tokyo, Japan
  • fYear
    2014
  • fDate
    21-24 Sept. 2014
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    This paper proposes a unified approach for jointly solving underdetermined source separation, audio event detection and dereverberation of convolutive mixtures. For monaural source separation, one successful approach involves applying non-negative matrix factorization (NMF) to the magnitude spectrogram of a mixture signal, interpreted as a non-negative matrix. Several attempts have recently been made to extend this approach to the multichannel case in order to utilize the spatial correlation of the multichannel inputs as an additional clue for source separation. Multichannel NMF assumes that an observed signal is a mixture of a limited number of source signals, each of which has a static power spectral density scaled by a time-varying amplitude. We have previously proposed an extension of this approach, in which the variations over time of the spectral density and the total power of each source are modeled by a hidden Markov model (HMM). This has allowed us to solve source activity detection and source separation simultaneously through model parameter inference. While that method was based on an anechoic mixing model, the aim of this paper is to extend it further to deal with reverberation by incorporating an echoic mixing model into the generative model of observed signals. In an experiment on underdetermined source separation under reverberant conditions, we confirmed that the proposed method provided a 9.61 dB improvement over the conventional method in terms of the signal-to-interference ratio.
  • Keywords
    acoustic convolution; audio signal processing; blind source separation; echo; hidden Markov models; matrix decomposition; reverberation; signal detection; spectral analysis; time-varying channels; NMF; anechoic mixing model; audio event detection; audio source dereverberation; audio source separation; convolutive mixture dereverberation; echoic mixing model; mixture signal spectrogram; multichannel factorial hidden Markov model; multichannel spatial correlation utilization; nonnegative matrix factorization; observed signal generative model; reverberation; signal-to-interference ratio; source activity detection; spectral density time variation; static power spectral density; time-varying amplitude; Arrays; Hidden Markov models; Microphones; Optimization; Source separation; Spectrogram; Time-frequency analysis; audio event detection; dereverberation; multichannel factorial hidden Markov model; non-negative matrix factorization; source separation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning for Signal Processing (MLSP), 2014 IEEE International Workshop on
  • Conference_Location
    Reims
  • Type

    conf

  • DOI
    10.1109/MLSP.2014.6958927
  • Filename
    6958927
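
The monaural NMF baseline mentioned in the abstract (a non-negative spectrogram approximated by static spectra scaled by time-varying gains) can be illustrated with a minimal sketch. This is not the paper's multichannel factorial HMM; it is plain single-channel NMF with the standard Euclidean multiplicative updates, written in pure Python on a toy matrix. The function names (`nmf`, `matmul`, `frobenius_error`), the toy data, and all parameter values are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius distance between V and the product W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh))


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (freq x frames) as W H.

    W (freq x rank) holds one static spectrum per component and
    H (rank x frames) holds the time-varying gains. Uses the
    multiplicative updates for the Euclidean cost.
    """
    rng = random.Random(seed)
    n_freq, n_frames = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_freq)]
    h = [[rng.random() + eps for _ in range(n_frames)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][t] * num[k][t] / (den[k][t] + eps) for t in range(n_frames)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[f][k] * num[f][k] / (den[f][k] + eps) for k in range(rank)]
             for f in range(n_freq)]
    return w, h


# Toy "magnitude spectrogram": two sources with fixed spectra and varying gains.
spectra = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]   # 4 bins x 2 sources
gains = [[1.0, 0.0, 2.0, 0.0, 1.0], [0.0, 1.0, 0.0, 3.0, 1.0]]  # 2 sources x 5 frames
v = matmul(spectra, gains)

w, h = nmf(v, rank=2)
print("reconstruction error:", frobenius_error(v, w, h))
```

In practice `v` would be the magnitude of a short-time Fourier transform rather than a hand-built matrix, and an array library would replace the list-based helpers.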