Hand-engineered definitions of spatial categories are increasingly seen as brittle; an agent engaged in human interaction may instead need to learn spatial concepts in terms of perceptually grounded "image schemas". Here we present a developmental approach for the acquisition of grounded spatial schemas in a perceptual agent. We assume a capability for dynamic visual attention, together with perceptual notions of wholeness and proximity. We first learn perceptual-object to linguistic-name mappings from simple 2D multi-agent visual streams co-occurring with word-separated utterances. Mutual-information-based statistical measures prove sufficient to identify the nominal participants in a simple discourse, given a synthetic model of dynamic visual attention. Next, we use this knowledge of nominals to ground the semantics of spatial relations in language. We show that a notion of proximity between perceptual objects is sufficient to obtain a preverbal notion of graded spatial poses. Once linguistic data is superimposed on this, simple associative structures lead to distinctions such as "in" versus "out". Finally, we show how this can lead to a model of actions, in which verbs are learned along with their associated argument structures.
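The word-to-object mapping step described above can be sketched with a simple pointwise mutual information computation over co-occurrence counts. This is an illustrative reconstruction, not the paper's implementation: the episode format, the toy data, and the object labels (`BALL`, `BOX`) are assumptions standing in for the attended percepts of the visual stream.

```python
import math
from collections import Counter

def word_object_mi(episodes):
    """Estimate pointwise mutual information between words and attended
    object labels over (utterance, attended_objects) episodes."""
    word_counts, obj_counts, joint = Counter(), Counter(), Counter()
    n = len(episodes)
    for words, objects in episodes:
        for w in set(words):
            word_counts[w] += 1
        for o in set(objects):
            obj_counts[o] += 1
            for w in set(words):
                joint[(w, o)] += 1
    # PMI(w, o) = log2( P(w, o) / (P(w) * P(o)) )
    return {
        (w, o): math.log2((c / n) / ((word_counts[w] / n) * (obj_counts[o] / n)))
        for (w, o), c in joint.items()
    }

# Hypothetical toy episodes: each pairs a word-separated utterance
# with the objects in the attentional focus at that moment.
episodes = [
    (["the", "ball", "rolls"], ["BALL"]),
    (["the", "box", "is", "big"], ["BOX"]),
    (["ball", "near", "box"], ["BALL", "BOX"]),
    (["the", "ball"], ["BALL"]),
]
mi = word_object_mi(episodes)
# "ball" associates more strongly with BALL than the function word "the" does.
```

The idea is only that content words co-occur with their referents more reliably than function words do, so PMI separates nominal participants from the rest of the utterance.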
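The grounding of graded spatial poses and the associative "in"/"out" distinction can likewise be sketched, again under stated assumptions: the circular region, the sigmoid squashing of signed boundary distance, and the averaging-based associative learner are all illustrative choices, not the paper's model.

```python
import math

def containment(point, center, radius):
    """Graded 'in-ness' of a 2D point w.r.t. a circular region:
    a sigmoid of the signed distance to the boundary, so the score
    varies smoothly from ~0 (far outside) to ~1 (well inside)."""
    signed = radius - math.dist(point, center)
    return 1.0 / (1.0 + math.exp(-4.0 * signed))  # slope 4.0 is an assumption

def learn_associations(labelled_scenes):
    """Simple associative structure: link each spatial word to the mean
    graded pose of the scenes in which it was uttered."""
    totals, counts = {}, {}
    for word, point, center, radius in labelled_scenes:
        g = containment(point, center, radius)
        totals[word] = totals.get(word, 0.0) + g
        counts[word] = counts.get(word, 0) + 1
    return {w: totals[w] / counts[w] for w in totals}

# Hypothetical scenes: a located object's position, a reference region,
# and the word heard while attending to the configuration.
scenes = [
    ("in", (0.0, 0.0), (0.0, 0.0), 1.0),
    ("in", (0.3, 0.2), (0.0, 0.0), 1.0),
    ("out", (3.0, 0.0), (0.0, 0.0), 1.0),
    ("out", (0.0, 2.5), (0.0, 0.0), 1.0),
]
assoc = learn_associations(scenes)
# assoc["in"] ends up high and assoc["out"] low, so the graded
# preverbal measure supports a categorical in/out distinction.
```

The point of the sketch is that a continuous, preverbal proximity measure already orders the scenes; superimposing linguistic labels then needs only counting and averaging to separate "in" from "out".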