DocumentCode :
1652339
Title :
Learning audio-visual associations using mutual information
Author :
Roy, Deb ; Schiele, Bernt ; Pentland, Alex
Author_Institution :
MIT, Cambridge, MA, USA
fYear :
1999
Firstpage :
147
Lastpage :
163
Abstract :
This paper addresses the problem of finding useful associations between audio and visual input signals. The proposed approach is based on the maximization of mutual information of audio-visual clusters. This approach segments continuous speech signals and finds visual categories that correspond to the segmented spoken words. Such audio-visual associations may be used to model infant language acquisition and to dynamically personalize speech-based human-computer interfaces for applications including catalog browsing and wearable computing. This paper describes an implemented system for learning shape names from camera and microphone input. We present evaluation results for the system in the domain of modeling language learning.
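The clustering criterion named in the abstract can be illustrated with a short sketch. The snippet below is a minimal editorial illustration, not the authors' implementation: it assumes audio-visual co-occurrences have already been tallied into a hypothetical contingency table joint_counts (rows: audio clusters, columns: visual clusters) and computes the mutual information I(A; V) that the method seeks to maximize; the speech segmentation and cluster formation themselves are outside the sketch.

import numpy as np

def mutual_information(joint_counts):
    """I(A; V) in bits from a co-occurrence table of audio-cluster
    vs. visual-cluster assignments (rows: audio, columns: visual)."""
    p = joint_counts / joint_counts.sum()   # joint distribution p(a, v)
    pa = p.sum(axis=1, keepdims=True)       # marginal p(a), shape (A, 1)
    pv = p.sum(axis=0, keepdims=True)       # marginal p(v), shape (1, V)
    mask = p > 0                            # skip zero cells (0 log 0 = 0)
    return float((p[mask] * np.log2(p[mask] / (pa @ pv)[mask])).sum())

# Hypothetical example: two audio clusters that co-occur strongly with
# two distinct visual categories yield high mutual information.
counts = np.array([[40.0, 2.0],
                   [3.0, 35.0]])
print(mutual_information(counts))  # near 1 bit for this near-diagonal table

A search over candidate spoken-word segments and visual prototypes could then retain the pairings that most increase this quantity, in the spirit of the maximization the abstract describes.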
Keywords :
image segmentation; optimisation; speech-based user interfaces; audio-visual associations learning; catalog browsing; continuous speech signals; infant language acquisition; language learning; maximization; mutual information; speech-based human-computer interfaces; visual categories; wearable computing; Cameras; Clothing; Electronic switching systems; Laboratories; Mutual information; Natural languages; Shape; Speech; Streaming media; Wearable computers
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Integration of Speech and Image Understanding, 1999. Proceedings
Conference_Location :
Corfu, Greece
Print_ISBN :
0-7695-0471-X
Type :
conf
DOI :
10.1109/ISIU.1999.824909
Filename :
824909