DocumentCode :
960330
Title :
Short-Term Spatio–Temporal Clustering Applied to Multiple Moving Speakers
Author :
Lathoud, Guillaume ; Odobez, Jean-Marc
Author_Institution :
IDIAP Res. Inst., Martigny
Volume :
15
Issue :
5
fYear :
2007
fDate :
7/1/2007
Firstpage :
1696
Lastpage :
1710
Abstract :
Distant microphones make it possible to process spontaneous multiparty speech with very few constraints on speakers, as opposed to close-talking microphones. Minimizing the constraints on speakers permits a large diversity of applications, including meeting summarization and browsing, surveillance, hearing aids, and more natural human-machine interaction. Such applications of distant microphones require determining where and when the speakers are talking. This is inherently a multisource problem, because of background noise sources, as well as the natural tendency of multiple speakers to talk over each other. Moreover, spontaneous speech utterances are highly discontinuous, which makes it difficult to track the multiple speakers with classical filtering approaches, such as Kalman filtering or particle filters. As an alternative, this paper proposes a probabilistic framework to determine the trajectories of multiple moving speakers in the short term only, i.e., only while they speak. Instantaneous location estimates that are close in space and time are grouped into "short-term clusters" in a principled manner. Each short-term cluster determines the precise start and end times of an utterance and a short-term spatial trajectory. Contrastive experiments clearly show the benefit of using short-term clustering, on real indoor recordings with seated speakers in meetings, as well as multiple moving speakers.
Keywords :
crosstalk; probability; spatiotemporal phenomena; speaker recognition; Kalman filtering; classical filtering approach; close-talking microphone; distant microphone; human-machine interaction; multiple moving speaker; probabilistic framework; short-term cluster; spatial trajectory; spatio-temporal clustering; spontaneous speech utterance; Background noise; Filtering; Hearing aids; Kalman filters; Man machine systems; Microphones; Particle filters; Particle tracking; Speech processing; Surveillance; Localization; multiple acoustic sources; short-term clustering; speech segmentation; tracking;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2007.896667
Filename :
4244525