Regional Information Center for Science and Technology (RICeST) - Informative subspaces for audio-visual processing: High-level function from low-level fusion

DocumentCode :

2882050

Title :

Informative subspaces for audio-visual processing: High-level function from low-level fusion

Author :

Fisher, John W., III ; Darrell, Trevor

Author_Institution :

Massachusetts Institute of Technology, Cambridge, MA 02139, USA

Volume :

Year :

2002

Date :

13-17 May 2002

Abstract :

We propose a new probabilistic model of single source multi-modal generation, and show algorithms for maximizing mutual information which find correspondences between signal components. We show a nonparametric method for finding informative subspaces that captures complex statistical relationships between different modalities. We extend a previous subspace method to include new priors on the projection weights, yielding more robust results. Applied to human speakers, our model finds a relationship between audio speech and video of facial motion, and partially segments background events in both channels. We present new results on the problem of audio-visual verification, and show how the audio and video of a speaker can be matched without a prior model of the speaker's voice or appearance.

Keywords :

Artificial neural networks; Gold; Pixel;

Language :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on

Conference_Location :

Orlando, FL, USA

ISSN :

1520-6149

Print_ISBN :

0-7803-7402-9

Type :

conf

DOI :

10.1109/ICASSP.2002.5745560

Filename :

5745560

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2882050