Dynamically structuring, updating and interrelating representations of visual and linguistic discourse context

Authors:
J. Kelleher; F. Costello; J. van Genabith
Affiliations:
Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Saarbrücken, Germany; University College Dublin, Ireland; Dublin City University, Ireland
Venue:
Artificial Intelligence - Special volume on connecting language to the world
Year:
2005




Abstract

The fundamental claim of this paper is that salience, both visual and linguistic, is an important overarching semantic category structuring visually situated discourse. On this basis, we argue that computer systems attempting to model the evolving context of a visually situated discourse should integrate models of visual and linguistic salience within their natural language processing (NLP) framework. The paper highlights the importance of dynamically updating and interrelating visual and linguistic discourse context representations. To support our approach, we have developed a real-time, natural language virtual reality (NLVR) system (called LIVE, for Linguistic Interaction with Virtual Environments) that implements an NLP framework based on both visual and linguistic salience. Within this framework, salience information underpins two of the core subtasks of NLP: reference resolution and the generation of referring expressions. We describe the theoretical basis and architecture of the LIVE NLP framework and present extensive evaluation results comparing the system's performance with that of human participants in a number of experiments.
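
As a rough illustration of how salience could drive reference resolution, the sketch below ranks the objects that match a description by a combined visual and linguistic salience score. It is not taken from the LIVE system: the `Entity` fields, the `resolve` function, the weighted-sum combination and the `w_visual` parameter are all assumptions made for this example, and the abstract does not say how LIVE actually combines the two kinds of salience.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entity:
    """A scene object with hypothetical salience scores in [0, 1]."""
    name: str
    kind: str          # object type, e.g. "house"
    colour: str
    visual: float      # assumed visual salience (e.g. from size and centrality)
    linguistic: float  # assumed linguistic salience (e.g. from recent mention)


def resolve(kind: str, colour: Optional[str], entities: List[Entity],
            w_visual: float = 0.5) -> Optional[Entity]:
    """Return the most salient entity matching the description, or None.

    The weighted sum is an illustrative assumption, not the paper's method.
    """
    candidates = [e for e in entities
                  if e.kind == kind and (colour is None or e.colour == colour)]
    if not candidates:
        return None

    def score(e: Entity) -> float:
        return w_visual * e.visual + (1.0 - w_visual) * e.linguistic

    return max(candidates, key=score)


scene = [
    Entity("h1", "house", "red", visual=0.8, linguistic=0.0),
    Entity("h2", "house", "blue", visual=0.3, linguistic=0.9),
    Entity("t1", "tree", "green", visual=0.6, linguistic=0.0),
]

# "the house": h2 was mentioned recently, so it outranks the more visible h1.
print(resolve("house", None, scene).name)
# "the red house": the colour constraint leaves h1 as the only candidate.
print(resolve("house", "red", scene).name)
```

With equal weights, the underspecified description "the house" resolves to the recently mentioned `h2`, while setting `w_visual=1.0` would make the more visually prominent `h1` win. This is the kind of interplay between visual and linguistic context that the abstract argues a situated NLP system needs to model.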