Attention, intentions, and the structure of discourse
Computational Linguistics
“Put-that-there”: Voice and gesture at the graphics interface
SIGGRAPH '80 Proceedings of the 7th annual conference on Computer graphics and interactive techniques
Unification-based multimodal integration
ACL '97 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
A centering approach to pronouns
ACL '87 Proceedings of the 25th annual meeting on Association for Computational Linguistics
Providing a unified account of definite noun phrases in discourse
ACL '83 Proceedings of the 21st annual meeting on Association for Computational Linguistics
Evaluating discourse processing algorithms
ACL '89 Proceedings of the 27th annual meeting on Association for Computational Linguistics
Multi-Modal Definite Clause Grammar
COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 2
Multi-Modal-Method: a design method for building multi-modal systems
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 2
References to graphical objects in interactive multimodal queries
Knowledge-Based Systems
Referring expressions in multimodal dialogues differ from those in language-only dialogues: they often refer to items indicated by a gesture or by visual means. In this article we classify referring expressions into two types, deictic references and anaphoric references, and propose two general methods for resolving them. The first is a simple mapping algorithm that finds the items referred to on a screen, with or without an accompanying pointing gesture. The second is a centering algorithm with a dual cache model, which extends Walker's centering algorithm to multimodal dialogue systems and is well suited to resolving the various anaphoric references that arise in multimodal dialogue. In experiments, the proposed system correctly resolved 376 of 405 referring expressions in 40 dialogues (0.54 referring expressions per utterance), an accuracy of 92.84 percent.
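The paper's own mapping algorithm is not reproduced on this page; as a rough illustration only, deictic resolution of the kind the abstract describes might pair a pointing gesture with the nearest on-screen item, falling back to a salience ranking when no gesture accompanies the expression. All names and the distance heuristic below are hypothetical, not taken from the paper:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ScreenItem:
    name: str
    x: float
    y: float

@dataclass
class Gesture:
    x: float  # screen coordinates of the pointing gesture
    y: float

def resolve_deictic(items: list[ScreenItem],
                    gesture: Optional[Gesture],
                    salience: list[ScreenItem]) -> Optional[ScreenItem]:
    """Resolve a deictic expression ("this one") to a screen item.

    With a pointing gesture, pick the item nearest the gesture
    coordinates; without one, fall back to the most salient item
    (e.g. the most recently displayed or mentioned one).
    """
    if gesture is not None:
        # squared Euclidean distance is enough for an argmin
        return min(items,
                   key=lambda it: (it.x - gesture.x) ** 2 + (it.y - gesture.y) ** 2)
    return salience[0] if salience else None
```

For example, a gesture at (9, 9) with items at (0, 0) and (10, 10) would resolve to the latter; with no gesture, the head of the salience list is returned.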