A model for multimodal reference resolution

Authors:
Luis Pineda; Gabriela Garza
Affiliations:
National Autonomous University of Mexico (UNAM);-
Venue:
Computational Linguistics
Year:
2000

Abstract

An important aspect of the interpretation of multimodal messages is the ability to identify when the same object in the world is the referent of symbols in different modalities. To understand the caption of a picture, for instance, one needs to identify the graphical symbols that are referred to by names and pronouns in the natural language text. One way to think of this problem is in terms of the notion of anaphora; however, unlike linguistic anaphoric inference, in which antecedents for pronouns are selected from a linguistic context, in the interpretation of the textual part of multimodal messages the antecedents are selected from a graphical context. Under this view, resolving multimodal references is like resolving anaphora across modalities. Another way to see the same problem is to look at pronouns in texts about drawings as deictic. In this second view, the context of interpretation of a natural language term is defined as a set of expressions of a graphical language with well-defined syntax and semantics. Natural language and graphical terms are thought of as standing in a relation of translation similar to the translation relation that holds between natural languages. In this paper a theory based on this second view is presented. In this theory, the relations between multimodal representation and spatial deixis, on the one hand, and multimodal reasoning and deictic inference, on the other, are discussed. An integrated model of anaphoric and deictic resolution in the context of the interpretation of multimodal discourse is also advanced.