Regional Information Center for Science and Technology (RICEST) - Learning cross-modal appearance models with application to tracking

DocumentCode :

1878073

Title :

Learning cross-modal appearance models with application to tracking

Author :

Fisher, John W., III ; Darrell, Trevor

Author_Institution :

Artificial Intelligence Lab., Massachusetts Inst. of Technol., Cambridge, MA, USA

Volume :

fYear :

2003

fDate :

6-9 July 2003

Abstract :

Objects of interest are rarely silent or invisible. Analysis of multi-modal signal generation from a single object represents a rich and challenging area for smart sensor arrays. We consider the problem of simultaneously learning an audio and visual appearance model of a moving subject. We present a method that successfully learns such a model without the benefit of hand initialization, using only the associated audio signal to "decide" which object to model and track. We are particularly interested in modeling joint audio and video variation, such as that produced by a speaking face. We present an algorithm and experimental results for a human speaker moving in a scene.

Keywords :

image motion analysis; intelligent sensors; speaker recognition; audio appearance model; crossmodal appearance models; multi-modal signal generation; smart sensor arrays; speaking face; tracking; visual appearance model; Artificial intelligence; Intelligent sensors; Laboratories; Layout; Learning; Principal component analysis; Robustness; Sensor arrays; Signal analysis; Signal generators;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Multimedia and Expo, 2003. ICME '03. Proceedings. 2003 International Conference on

Print_ISBN :

0-7803-7965-9

Type :

conf

DOI :

10.1109/ICME.2003.1221541

Filename :

1221541

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1878073