Hand-engineered definitions of spatial categories are increasingly seen as brittle; an agent engaged in human interaction may instead need to learn spatial concepts in terms of perceptually grounded "image schemas". Here we present a developmental approach for the acquisition of grounded spatial schemas in a perceptual agent. We assume a capability for dynamic visual attention, together with perceptual notions of wholeness and proximity. We first learn perceptual-object to linguistic-name mappings from simple 2D multi-agent visual streams co-occurring with word-separated utterances. Mutual-information-based statistical measures prove sufficient to identify the nominal participants in a simple discourse, given a synthetic model of dynamic visual attention. Next, we use this knowledge of nominals to ground the semantics of spatial relations in language. We show that a notion of proximity between perceptual objects is sufficient to obtain a preverbal notion of graded spatial poses. Once linguistic data is superimposed on this, simple associative structures lead to distinctions such as "in" versus "out". Finally, we show how this can lead to a model of actions, in which verbs are learned along with their associated argument structures.
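The word-to-object mapping step described above can be sketched with a simple pointwise mutual information computation over co-occurrence counts. This is an illustrative reconstruction, not the paper's implementation: the episode format, the toy data, and the object labels (`BALL`, `BOX`) are assumptions standing in for the attended percepts of the visual stream.

```python
import math
from collections import Counter

def word_object_mi(episodes):
    """Estimate pointwise mutual information between words and attended
    object labels over (utterance, attended_objects) episodes."""
    word_counts, obj_counts, joint = Counter(), Counter(), Counter()
    n = len(episodes)
    for words, objects in episodes:
        for w in set(words):
            word_counts[w] += 1
        for o in set(objects):
            obj_counts[o] += 1
            for w in set(words):
                joint[(w, o)] += 1
    # PMI(w, o) = log2( P(w, o) / (P(w) * P(o)) )
    return {
        (w, o): math.log2((c / n) / ((word_counts[w] / n) * (obj_counts[o] / n)))
        for (w, o), c in joint.items()
    }

# Hypothetical toy episodes: each pairs a word-separated utterance
# with the objects in the attentional focus at that moment.
episodes = [
    (["the", "ball", "rolls"], ["BALL"]),
    (["the", "box", "is", "big"], ["BOX"]),
    (["ball", "near", "box"], ["BALL", "BOX"]),
    (["the", "ball"], ["BALL"]),
]
mi = word_object_mi(episodes)
# "ball" associates more strongly with BALL than the function word "the" does.
```

The idea is only that content words co-occur with their referents more reliably than function words do, so PMI separates nominal participants from the rest of the utterance.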
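The grounding of graded spatial poses and the associative "in"/"out" distinction can likewise be sketched, again under stated assumptions: the circular region, the sigmoid squashing of signed boundary distance, and the averaging-based associative learner are all illustrative choices, not the paper's model.

```python
import math

def containment(point, center, radius):
    """Graded 'in-ness' of a 2D point w.r.t. a circular region:
    a sigmoid of the signed distance to the boundary, so the score
    varies smoothly from ~0 (far outside) to ~1 (well inside)."""
    signed = radius - math.dist(point, center)
    return 1.0 / (1.0 + math.exp(-4.0 * signed))  # slope 4.0 is an assumption

def learn_associations(labelled_scenes):
    """Simple associative structure: link each spatial word to the mean
    graded pose of the scenes in which it was uttered."""
    totals, counts = {}, {}
    for word, point, center, radius in labelled_scenes:
        g = containment(point, center, radius)
        totals[word] = totals.get(word, 0.0) + g
        counts[word] = counts.get(word, 0) + 1
    return {w: totals[w] / counts[w] for w in totals}

# Hypothetical scenes: a located object's position, a reference region,
# and the word heard while attending to the configuration.
scenes = [
    ("in", (0.0, 0.0), (0.0, 0.0), 1.0),
    ("in", (0.3, 0.2), (0.0, 0.0), 1.0),
    ("out", (3.0, 0.0), (0.0, 0.0), 1.0),
    ("out", (0.0, 2.5), (0.0, 0.0), 1.0),
]
assoc = learn_associations(scenes)
# assoc["in"] ends up high and assoc["out"] low, so the graded
# preverbal measure supports a categorical in/out distinction.
```

The point of the sketch is that a continuous, preverbal proximity measure already orders the scenes; superimposing linguistic labels then needs only counting and averaging to separate "in" from "out".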