Algorithm schemata and data structures in syntactic processing. In Readings in Natural Language Processing.
The logic of typed feature structures.
Speech Communication - Special issue on interactive voice technology for telecommunication applications (IVITA '96).
Generating coordinated natural language and 3D animations for complex spatial explanations. In Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI '98/IAAI '98).
Unification-based multimodal integration. In Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics (ACL/EACL '97).
Unification-based multimodal parsing. In Proceedings of the 17th International Conference on Computational Linguistics (COLING '98), Volume 1.
F-PATR: functional constraints for unification-based grammars. In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics (ACL '93).
Finite-state multimodal parsing and understanding. In Proceedings of the 18th International Conference on Computational Linguistics (COLING '00), Volume 1.
Multimodal interactive maps: designing for human performance. Human-Computer Interaction.
Modality fusion for graphic design applications. In Proceedings of the 6th International Conference on Multimodal Interfaces (ICMI '04).
Finite-state multimodal integration and understanding. Natural Language Engineering.
MATCH: an architecture for multimodal dialogue systems. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL '02).
Robust gesture processing for multimodal interaction. In Proceedings of the 10th International Conference on Multimodal Interfaces (ICMI '08).
Robust understanding in multimodal interfaces. Computational Linguistics.
Proceedings of the 1st International Conference on Digital Libraries: Research and Development (DELOS '07).
In order to realize their full potential, multimodal interfaces need to support not just input from multiple modes, but also single commands optimally distributed across the available input modes. A multimodal language processing architecture is needed to integrate semantic content from the different modes. Johnston (1998a) proposes a modular approach to multimodal language processing in which spoken language parsing is completed before multimodal parsing. In this paper, I demonstrate the difficulties this approach faces as the spoken language parsing component is expanded to provide a compositional analysis of deictic expressions. I propose an alternative architecture in which spoken and multimodal parsing are tightly interleaved. This architecture greatly simplifies the spoken language grammar and enables predictive information from spoken language parsing to drive the application of multimodal parsing and gesture combination rules. I also propose a treatment of deictic numeral expressions that supports the broad range of pen gesture combinations that can be used to refer to collections of objects in the interface.
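To make the integration step concrete, here is a minimal Python sketch of unification-based combination of a spoken deictic numeral expression with pen gestures, in the spirit of the unification-based approaches cited above. Everything in it (the unify helper, the feature labels, the integrate rule) is an illustrative assumption, not the grammar or rule set from the paper.

def unify(a, b):
    # Recursively unify two feature structures (nested dicts with atomic
    # leaves); return the unified structure, or None on a feature clash.
    if a is None or b is None:
        return None
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for key, b_val in b.items():
            if key in out:
                merged = unify(out[key], b_val)
                if merged is None:
                    return None  # clash on a shared feature
                out[key] = merged
            else:
                out[key] = b_val
        return out
    return a if a == b else None  # atoms must match exactly

# Hypothetical spoken parse of "show me these three restaurants": the
# deictic numeral contributes a count constraint on the gesture input.
spoken = {
    "cat": "command",
    "content": {
        "action": "show",
        "object": {"type": "restaurant", "number": 3},
    },
}

# Three pen gestures, each selecting one restaurant on the display.
gestures = [{"type": "restaurant", "id": "r1"},
            {"type": "restaurant", "id": "r2"},
            {"type": "restaurant", "id": "r3"}]

def integrate(spoken_fs, gesture_list):
    # Interleaved-style combination rule: the spoken numeral predicts how
    # many selections the gesture stream must supply, so the count is
    # checked before the selected entities are unified in.
    expected = spoken_fs["content"]["object"].get("number")
    if expected is not None and len(gesture_list) != expected:
        return None  # numeral/gesture count mismatch: rule does not apply
    types = {g["type"] for g in gesture_list}
    if len(types) != 1:
        return None  # gestures must select objects of a single type
    selection = {"content": {"object": {"type": types.pop(),
                                        "ids": [g["id"] for g in gesture_list]}}}
    return unify(spoken_fs, selection)

print(integrate(spoken, gestures))
# -> the command with the three selected ids unified into its object slot

In this toy setup, a mismatched count (say, two gestures accompanying the numeral "three") makes integrate return None, which is the point at which a robust system would fall back to other combination rules or ask for clarification.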