Title :
Learning spoken words from multisensory input
Author :
Yu, Chen ; Ballard, Dana H.
Author_Institution :
Dept. of Comput. Sci., Rochester Univ., NY, USA
Abstract :
Speech recognition and speech translation are traditionally addressed by processing acoustic signals alone, while nonlinguistic information is typically ignored. We present a new method that learns spoken words from naturally co-occurring multisensory information in a dyadic (two-person) conversation. Listeners have a strong tendency to look toward the objects a speaker refers to during conversation. In light of this, we propose to use eye gaze to integrate acoustic and visual signals and to build audio-visual lexicons of objects. With such data gathered from conversations in different languages, the spoken names of objects can be translated across languages based on their shared visual semantics. We have developed a multimodal learning system and report experimental results using speech and video, in concert with eye movement records, as training data.
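The abstract describes two steps: associating spoken names with the objects that gaze is directed at, and pairing words across languages through their shared visual referents. The following is a minimal illustrative sketch of that idea only, not the authors' system; the (word, fixated object) pairing, the data layout, and identifiers such as build_lexicon and translate are assumptions introduced here for illustration.

from collections import Counter, defaultdict

def build_lexicon(utterances):
    """Map each gaze-attended object to its most frequent spoken name.

    `utterances` is a list of (spoken_word, fixated_object) pairs, a
    stand-in for word hypotheses aligned with eye-movement records.
    """
    counts = defaultdict(Counter)
    for word, obj in utterances:
        counts[obj][word] += 1
    # Keep the dominant word form per object, a simple proxy for the
    # statistical audio-visual association learned from co-occurrence.
    return {obj: c.most_common(1)[0][0] for obj, c in counts.items()}

def translate(lexicon_a, lexicon_b):
    """Pair words across two languages that share a visual referent."""
    return {lexicon_a[obj]: lexicon_b[obj]
            for obj in lexicon_a if obj in lexicon_b}

if __name__ == "__main__":
    # Toy data: object identifiers and word forms are hypothetical.
    english = build_lexicon([("cup", "obj_cup"), ("cup", "obj_cup"),
                             ("book", "obj_book")])
    mandarin = build_lexicon([("beizi", "obj_cup"), ("shu", "obj_book")])
    print(translate(english, mandarin))  # {'cup': 'beizi', 'book': 'shu'}

In this sketch the visual referent acts as the pivot between languages, which mirrors the paper's claim that translation can be grounded in visual semantics rather than in parallel text.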
Keywords :
acoustic signal processing; eye; language translation; natural languages; speech recognition; video signal processing; acoustic signals; audio-visual lexicons; dyadic two-person conversation; eye movement records; multimodal learning system; multisensory information; multisensory input; nonlinguistic information; speech translation; spoken word learning; training data; video processing; visual semantics; visual signals; Authentication; Computer science; Humans; Learning systems; Loudspeakers; Natural languages; Pediatrics; Signal processing; Speech processing; Speech recognition;
Conference_Title :
2002 6th International Conference on Signal Processing
Print_ISBN :
0-7803-7488-6
DOI :
10.1109/ICOSP.2002.1179956