• DocumentCode
    2575464
  • Title

    Towards co-channel speaker separation by 2-D demodulation of spectrograms

  • Author

    Wang, Tianyu T. ; Quatieri, Thomas F.

  • fYear
    2009
  • fDate
    18-21 Oct. 2009
  • Firstpage
    65
  • Lastpage
    68
  • Abstract
    This paper explores a two-dimensional (2-D) processing approach for co-channel speaker separation of voiced speech. We analyze localized time-frequency regions of a narrowband spectrogram using 2-D Fourier transforms and propose a 2-D amplitude modulation model based on pitch information for single and multi-speaker content in each region. Our model maps harmonically-related speech content to concentrated entities in a transformed 2-D space, thereby motivating 2-D demodulation of the spectrogram for analysis/synthesis and speaker separation. Using a priori pitch estimates of individual speakers, we show through a quantitative evaluation 1) the utility of the model for representing speech content of a single speaker and 2) its feasibility for speaker separation. For the separation task, we also illustrate benefits of the model's representation of pitch dynamics relative to a sinusoidal-based separation system.
  • Keywords
    Fourier transforms; amplitude modulation; source separation; speech synthesis; 2D Fourier transforms; 2D amplitude modulation model; 2D demodulation; a priori pitch estimation; cochannel speaker separation; narrowband spectrogram; sinusoidal-based separation system; speech content representation; two-dimensional processing approach; Demodulation; Information analysis; Narrowband; Spectrogram; Speech analysis; Speech processing; Time frequency analysis; Two dimensional displays; 2-D speech analysis; Grating Compression Transform; speaker separation; spectrogram demodulation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Applications of Signal Processing to Audio and Acoustics, 2009. WASPAA '09. IEEE Workshop on
  • Conference_Location
    New Paltz, NY
  • ISSN
    1931-1168
  • Print_ISBN
    978-1-4244-3678-1
  • Electronic_ISBN
    1931-1168
  • Type

    conf

  • DOI
    10.1109/ASPAA.2009.5346526
  • Filename
    5346526