Context-Based Multimodal Input Understanding in Conversational Systems

  • Authors:
  • Joyce Chai, Shimei Pan, Michelle X. Zhou, Keith Houck

  • Affiliation:
  • IBM T.J. Watson Research Center (all authors)

  • Venue:
  • ICMI '02 Proceedings of the 4th IEEE International Conference on Multimodal Interfaces
  • Year:
  • 2002


Abstract

In a multimodal human-machine conversation, user inputs are often abbreviated or imprecise. Sometimes, fusing multimodal inputs alone cannot yield a complete understanding. To address these inadequacies, we are building a semantics-based multimodal interpretation framework called MIND (Multimodal Interpretation for Natural Dialog). The unique feature of MIND is its use of a variety of contexts (e.g., domain context and conversation context) to enhance multimodal fusion. In this paper, we present a semantically rich modeling scheme and a context-based approach that enable MIND to gain a full understanding of user inputs, including those that are ambiguous or incomplete.