• DocumentCode
    463680
  • Title

    Dynamic Dependency Tests for Audio-Visual Speaker Association

  • Author

    Siracusa, M.R.; Fisher, John W.

  • Author_Institution
    Comput. Sci. & Artificial Intelligence Lab., MIT, MA, USA
  • Volume
    2
  • fYear
    2007
  • fDate
    15-20 April 2007
  • Abstract
    We formulate the problem of audio-visual speaker association as a dynamic dependency test. That is, given an audio stream and multiple video streams, we wish to determine their dependency structure as it evolves over time. To this end, we propose the use of a hidden factorization Markov model in which the hidden state encodes a finite number of possible dependency structures. Each dependency structure has an explicit semantic meaning, namely "who is speaking". This model takes advantage of both the structural and the parametric changes associated with a change in speaker, in contrast to standard sliding-window-based dependence analysis. Using this model, we obtain state-of-the-art performance on an audio-visual association task without the benefit of training data.
  • Keywords
    Markov processes; audio signal processing; speaker recognition; video signal processing; audio stream; audio-visual speaker association; dynamic dependency tests; hidden factorization Markov model; multiple video streams; Artificial intelligence; Bayesian methods; Computer science; Context modeling; Hidden Markov models; Layout; Random variables; Streaming media; Testing; Training data; Pattern clustering methods
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
  • Conference_Location
    Honolulu, HI
  • ISSN
    1520-6149
  • Print_ISBN
    1-4244-0727-3
  • Type

    conf

  • DOI
    10.1109/ICASSP.2007.366271
  • Filename
    4217444