DocumentCode :
3530287
Title :
A speech fragment approach to localising multiple speakers in reverberant environments
Author :
Christensen, Heidi ; Ma, Ning ; Wrigley, Stuart N. ; Barker, Jon
Author_Institution :
Dept. of Comput. Sci., Univ. of Sheffield, Sheffield
fYear :
2009
fDate :
19-24 April 2009
Firstpage :
4593
Lastpage :
4596
Abstract :
Sound source localisation cues are severely degraded when multiple acoustic sources are active in the presence of reverberation. We present a binaural system for localising simultaneous speakers which exploits the fact that in a speech mixture there exist spectro-temporal regions or 'fragments', where the energy is dominated by just one of the speakers. A fragment-level localisation model is proposed that integrates the localisation cues within a fragment using a weighted mean. The weights are based on local estimates of the degree of reverberation in a given spectro-temporal cell. The paper investigates different weight estimation approaches based variously on: i) an established model of the perceptual precedence effect; ii) a measure of interaural coherence between the left and right ear signals; iii) a data-driven approach trained in matched acoustic conditions. Experiments with reverberant binaural data with two simultaneous speakers show appropriate weighting can improve frame-based localisation performance by up to 24%.
Keywords :
speech processing; binaural system; data-driven approach; ear signals; fragment-level localisation model; interaural coherence; localisation cues; multiple acoustic sources; multiple speakers; perceptual precedence effect; reverberant environment; sound source localisation; spectrotemporal cell; spectrotemporal regions; speech fragment approach; weight estimation; weighted mean; Acoustic measurements; Coherence; Computer science; Degradation; Ear; Humans; Loudspeakers; Reverberation; Robustness; Speech; Binaural Localisation; Multi-source; Reverberation; Spectro-Temporal Processing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location :
Taipei
ISSN :
1520-6149
Print_ISBN :
978-1-4244-2353-8
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2009.4960653
Filename :
4960653