• DocumentCode
    3349285
  • Title
    Multiple person and speaker activity tracking with a particle filter
  • Author
    Checka, Neal; Wilson, Kevin W.; Siracusa, Michael R.; Darrell, Trevor
  • Author_Institution
    Artificial Intelligence Lab., MIT, Cambridge, MA, USA
  • Volume
    5
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    In this paper, we present a system that combines sound and vision to track multiple people. In a cluttered or noisy scene, multi-person tracking estimates have a distinctly non-Gaussian distribution. We apply a particle filter with audio and video state components, and derive observation likelihood methods based on both audio and video measurements. Our state includes the number of people present, their positions, and whether each person is talking. We show experiments in an environment with sparse microphones and monocular cameras. Our results show that our system can accurately track the locations and speech activity of a varying number of people.
  • Keywords
    Monte Carlo methods; audio signal processing; optical tracking; position measurement; video signal processing; audio state components; audio-visual state space model; cluttered scene; monocular cameras; multi-modal tracking architecture; multiple person activity tracking; multiple speaker activity tracking; noisy scene; non-Gaussian distribution; observation likelihood methods; particle filter; people number; person position; sparse microphones; speech activity; talking person; video state components; Acoustic noise; Cameras; Filtering; Layout; Loudspeakers; Microphone arrays; Particle filters; Particle measurements; Particle tracking; Speech
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.2004.1327252
  • Filename
    1327252