A model for multimodal reference resolution

  • Authors:
  • Luis A. Pineda; E. Gabriela Garza

  • Affiliations:
  • Unit of Informatic Systems, Cuernavaca, Mor., México (both authors)

  • Venue:
  • Referring Phenomena '97: Referring Phenomena in a Multimedia Context and their Computational Treatment
  • Year:
  • 1997


Abstract

This paper presents a discussion of multimodal reference resolution. The discussion centers on how the referent of an expression in one modality can be found when the contextual information required for carrying out such an inference is expressed in one or more different modalities. In particular, a model is presented for identifying the referent of a graphical expression when the relevant contextual information is expressed through natural language. The model is also applied to the reciprocal problem of identifying the referent of a linguistic expression when a graphical context is given. Section 1 presents the notion of modality in terms of which the theory is developed; the discussion is motivated with a case study in multimodal reference resolution. Section 2 presents a theory of multimodal representation along the lines of Montague's semiotic programme. Section 3 illustrates an incremental model for multimodal reference resolution. Section 4 advances a brief discussion of how the theory could be extended to handle multimodal discourse. Finally, the conclusion offers a reflection on the relation between spatial deixis and anaphora.