Title :
Acquiring linguistic argument structure from multimodal input using attentive focus
Author :
Satish, G. ; Mukerjee, Amitabha
Author_Institution :
Comput. Sci. & Eng., Indian Inst. of Technol. Kanpur, Kanpur
Abstract :
This work is premised on three assumptions: that the semantics of certain actions may be learned prior to language, that objects in attentive focus are likely to indicate the arguments participating in that action, and that knowing such arguments helps align linguistic attention on the relevant predicate (verb). Using a computational model of dynamic attention, we present an algorithm that clusters visual events into action classes in an unsupervised manner using the Merge Neural Gas algorithm. With few clusters, the model's classes correspond to coarse concepts such as come-closer, but at a finer granularity, it reveals hierarchical substructure such as come-closer-one-object-static and come-closer-both-moving. It is also discovered that the argument ordering is non-commutative for actions such as chase or come-closer-one-object-static. Knowing the arguments, and given that noun-referent mappings are easily learned, language learning can now be constrained by considering only linguistic expressions and actions that refer to the objects in perceptual focus. We learn action schemas for linguistic units like “moving towards” or “chase”, and validate our results by producing output commentaries for 3D video.
Keywords :
linguistics; neural nets; 3D video; Merge Neural Gas algorithm; dynamic attention; language; linguistic argument structure; multimodal input using attentive focus; semantics; Clustering algorithms; Computational modeling; Computer science; Feature extraction; Focusing; Head; Layout; Object recognition; Production; Psychology
Conference_Title :
Development and Learning, 2008. ICDL 2008. 7th IEEE International Conference on
Conference_Location :
Monterey, CA
Print_ISBN :
978-1-4244-2661-4
Electronic_ISBN :
978-1-4244-2662-1
DOI :
10.1109/DEVLRN.2008.4640803