• DocumentCode
    3530287
  • Title

    A speech fragment approach to localising multiple speakers in reverberant environments

  • Author

    Christensen, Heidi ; Ma, Ning ; Wrigley, Stuart N. ; Barker, Jon

  • Author_Institution
    Dept. of Comput. Sci., Univ. of Sheffield, Sheffield
  • fYear
    2009
  • fDate
    19-24 April 2009
  • Firstpage
    4593
  • Lastpage
    4596
  • Abstract
    Sound source localisation cues are severely degraded when multiple acoustic sources are active in the presence of reverberation. We present a binaural system for localising simultaneous speakers which exploits the fact that in a speech mixture there exist spectro-temporal regions or 'fragments', where the energy is dominated by just one of the speakers. A fragment-level localisation model is proposed that integrates the localisation cues within a fragment using a weighted mean. The weights are based on local estimates of the degree of reverberation in a given spectro-temporal cell. The paper investigates different weight estimation approaches based variously on, i) an established model of the perceptual precedence effect; ii) a measure of interaural coherence between the left and right ear signals; iii) a data-driven approach trained in matched acoustic conditions. Experiments with reverberant binaural data with two simultaneous speakers show appropriate weighting can improve frame-based localisation performance by up to 24%.
  • Keywords
    speech processing; binaural system; data-driven approach; ear signals; fragment-level localisation model; interaural coherence; localisation cues; multiple acoustic sources; multiple speakers; perceptual precedence effect; reverberant environment; sound source localisation; spectrotemporal cell; spectrotemporal regions; speech fragment approach; weight estimation; weighted mean; Acoustic measurements; Coherence; Computer science; Degradation; Ear; Humans; Loudspeakers; Reverberation; Robustness; Speech; Binaural Localisation; Multi-source; Reverberation; Spectro-Temporal Processing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
  • Conference_Location
    Taipei
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-2353-8
  • Electronic_ISBN
    1520-6149
  • Type

    conf

  • DOI
    10.1109/ICASSP.2009.4960653
  • Filename
    4960653