• DocumentCode
    1044582
  • Title
    Two-Microphone Separation of Speech Mixtures
  • Author
    Pedersen, Michael Syskind ; Wang, DeLiang ; Larsen, Jan ; Kjems, Ulrik
  • Volume
    19
  • Issue
    3
  • fYear
    2008
  • fDate
    3/1/2008 12:00:00 AM
  • Firstpage
    475
  • Lastpage
    492
  • Abstract
    Separation of speech mixtures, often referred to as the cocktail party problem, has been studied for decades. In many source separation tasks, the separation method is limited by the assumption of at least as many sensors as sources. Further, many methods require that the number of signals within the recorded mixtures be known in advance. In many real-world applications, these limitations are too restrictive. We propose a novel method for underdetermined blind source separation using an instantaneous mixing model which assumes closely spaced microphones. Two source separation techniques have been combined, independent component analysis (ICA) and binary time-frequency (T-F) masking. By estimating binary masks from the outputs of an ICA algorithm, it is possible in an iterative way to extract basis speech signals from a convolutive mixture. The basis signals are afterwards improved by grouping similar signals. Using two microphones, we can separate, in principle, an arbitrary number of mixed speech signals. We show separation results for mixtures with as many as seven speech signals under instantaneous conditions. We also show that the proposed method is applicable to segregate speech signals under reverberant conditions, and we compare our proposed method to another state-of-the-art algorithm. The number of source signals is not assumed to be known in advance and it is possible to maintain the extracted signals as stereo signals.
  • Keywords
    blind source separation; independent component analysis; microphones; speech processing; binary time-frequency masking; cocktail party problem; speech mixture separation; speech signal extraction; two-microphone separation; Ideal binary mask; independent component analysis (ICA); time–frequency (T–F) masking; underdetermined speech separation; Algorithms; Humans; Neural Networks (Computer); Principal Component Analysis; Signal Processing, Computer-Assisted; Sound
  • fLanguage
    English
  • Journal_Title
    Neural Networks, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9227
  • Type
    jour
  • DOI
    10.1109/TNN.2007.911740
  • Filename
    4436182