Deixis and conjunction in multimodal systems

  • Author: Michael Johnston
  • Affiliation: AT&T Labs - Research, Shannon Laboratory, Florham Park, NJ
  • Venue: COLING '00: Proceedings of the 18th Conference on Computational Linguistics - Volume 1
  • Year: 2000

Abstract

In order to realize their full potential, multimodal interfaces need to support not just input from multiple modes, but single commands optimally distributed across the available input modes. A multimodal language processing architecture is needed to integrate semantic content from the different modes. Johnston (1998a) proposes a modular approach to multimodal language processing in which spoken language parsing is completed before multimodal parsing. In this paper, I will demonstrate the difficulties this approach faces as the spoken language parsing component is expanded to provide a compositional analysis of deictic expressions. I propose an alternative architecture in which spoken and multimodal parsing are tightly interleaved. This architecture greatly simplifies the spoken language parsing grammar and enables predictive information from spoken language parsing to drive the application of multimodal parsing and gesture combination rules. I also propose a treatment of deictic numeral expressions that supports the broad range of pen gesture combinations that can be used to refer to collections of objects in the interface.
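The abstract leaves the combination mechanics to the body of the paper. As a rough illustration of the final claim only, the Python sketch below enumerates the pen gesture combinations whose selected objects jointly satisfy a deictic numeral such as "these three"; the function name, data representation, and exhaustive enumeration strategy are assumptions for illustration, not the paper's grammar rules or parsing architecture.

```python
from itertools import combinations

def resolve_deictic_numeral(numeral, gestures):
    """Enumerate the sets of objects a deictic numeral could refer to.

    numeral  -- the count from the spoken expression ("these three" -> 3)
    gestures -- one set of object ids per pen gesture; a tap selects a
                single object, while an area or lasso gesture may select
                several objects at once.
    Returns every combination of gestures whose selected objects jointly
    cover exactly `numeral` distinct objects.
    """
    readings = []
    for k in range(1, len(gestures) + 1):
        for combo in combinations(gestures, k):
            referents = set().union(*combo)
            if len(referents) == numeral:
                readings.append(referents)
    return readings

# "these three restaurants" accompanied by two pen gestures:
# a tap on one restaurant and a lasso around two others.
tap = {"r1"}
lasso = {"r2", "r3"}
print(resolve_deictic_numeral(3, [tap, lasso]))
# -> [{'r1', 'r2', 'r3'}]: the tap and the lasso combine to satisfy the numeral
```

In the interleaved architecture the abstract describes, the numeral recovered during spoken language parsing would presumably drive gesture combination predictively, rather than by the brute-force enumeration used here for clarity.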