Intelligent multi-media interface technology
Intelligent user interfaces
The logic of typed feature structures
Regular models of phonological rule systems
Computational Linguistics - Special issue on computational phonology
Complexity of lexical descriptions and its relevance to partial parsing
Multimodal interaction for distributed interactive simulation
Readings in intelligent user interfaces
Mutual disambiguation of recognition errors in a multimodal architecture
Proceedings of the SIGCHI conference on Human Factors in Computing Systems
Finite state transducers: parsing free and frozen sentences
Extended finite state models of language
A Rational Design for a Weighted Finite-State Transducer Library
WIA '97 Revised Papers from the Second International Workshop on Implementing Automata
“Put-that-there”: Voice and gesture at the graphics interface
SIGGRAPH '80 Proceedings of the 7th annual conference on Computer graphics and interactive techniques
Finite-state transducers in language and speech processing
Computational Linguistics
Natural Language Engineering
Unification-based multimodal integration
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Finite-state approximation of constraint-based grammars using left-corner grammar transforms
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Unification-based multimodal parsing
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Stochastic finite-state models for spoken language machine translation
NAACL-ANLP-EMTS '00 Proceedings of the 2000 NAACL-ANLP Workshop on Embedded machine translation systems - Volume 5
Multimodal interactive maps: designing for human performance
Human-Computer Interaction
Multimodal integration-a statistical view
IEEE Transactions on Multimedia
Shared reality: spatial intelligence in intuitive user interfaces
Proceedings of the 7th international conference on Intelligent user interfaces
Multimodal event parsing for intelligent user interfaces
Proceedings of the 8th international conference on Intelligent user interfaces
Designing Transition Networks for Multimodal VR-Interactions Using a Markup Language
ICMI '02 Proceedings of the 4th IEEE International Conference on Multimodal Interfaces
Context-Sensitive Help for Multimodal Dialogue
ICMI '02 Proceedings of the 4th IEEE International Conference on Multimodal Interfaces
A probabilistic approach to reference resolution in multimodal user interfaces
Proceedings of the 9th international conference on Intelligent user interfaces
Exploiting emotions to disambiguate dialogue acts
Proceedings of the 9th international conference on Intelligent user interfaces
Deixis and conjunction in multimodal systems
COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 1
Modality fusion for graphic design applications
Proceedings of the 6th international conference on Multimodal interfaces
Linguistic theories in efficient multimodal reference resolution: an empirical investigation
Proceedings of the 10th international conference on Intelligent user interfaces
Finite-state multimodal integration and understanding
Natural Language Engineering
MATCH: an architecture for multimodal dialogue systems
ACL '02 Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics
Multimodal error correction for continuous handwriting recognition in pen-based user interfaces
Proceedings of the 11th international conference on Intelligent user interfaces
Put a grammar here: bi-directional parsing in multimodal interaction
CHI '06 Extended Abstracts on Human Factors in Computing Systems
Salience modeling based on non-verbal modalities for spoken language understanding
Proceedings of the 8th international conference on Multimodal interfaces
Multimodal fusion: a new hybrid strategy for dialogue systems
Proceedings of the 8th international conference on Multimodal interfaces
Optimization in multimodal interpretation
ACL '04 Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics
MATCHKiosk: a multimodal interactive city guide
ACLdemo '04 Proceedings of the ACL 2004 on Interactive poster and demonstration sessions
Semantic back-pointers from gesture
NAACL-DocConsortium '06 Proceedings of the 2006 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology: companion volume: doctoral consortium
A novel method for multi-sensory data fusion in multimodal human computer interaction
OZCHI '06 Proceedings of the 18th Australia conference on Computer-Human Interaction: Design: Activities, Artefacts and Environments
Proceedings of the 13th international conference on Intelligent user interfaces
An integrative recognition method for speech and gestures
ICMI '08 Proceedings of the 10th international conference on Multimodal interfaces
Robust gesture processing for multimodal interaction
ICMI '08 Proceedings of the 10th international conference on Multimodal interfaces
Clavius: bi-directional parsing for generic multimodal interaction
COLING ACL '06 Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
A multimodal pervasive framework for ambient assisted living
Proceedings of the 2nd International Conference on PErvasive Technologies Related to Assistive Environments
HLT-NAACL-Short '04 Proceedings of HLT-NAACL 2004: Short Papers
Robust understanding in multimodal interfaces
Computational Linguistics
Cognitive principles in robust multimodal interpretation
Journal of Artificial Intelligence Research
Gesture salience as a hidden variable for coreference resolution and keyframe extraction
Journal of Artificial Intelligence Research
IEEE Transactions on Audio, Speech, and Language Processing - Special issue on multimodal processing in speech-based interactions
Usage patterns and latent semantic analyses for task goal inference of multimodal user interactions
Proceedings of the 15th international conference on Intelligent user interfaces
Integrating multimodal cues using grammar based models
UAHCI'07 Proceedings of the 4th international conference on Universal access in human-computer interaction: ambient interaction
An input-parsing algorithm supporting integration of deictic gesture in natural language interface
HCI'07 Proceedings of the 12th international conference on Human-computer interaction: intelligent multimodal interaction environments
A hybrid grammar-based approach to multimodal languages specification
OTM'07 Proceedings of the 2007 OTM confederated international conference on On the move to meaningful internet systems - Volume Part I
Multimodal behavior realization for embodied conversational agents
Multimedia Tools and Applications
A multimodal reference resolution approach in virtual environment
VSMM'06 Proceedings of the 12th international conference on Interactive Technologies and Sociotechnical Systems
Mutual disambiguation of eye gaze and speech for sight translation and reading
Proceedings of the 6th workshop on Eye gaze in intelligent human machine interaction: gaze in multimodal interaction
Latent Semantic Analysis for Multimodal User Input With Speech and Gestures
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)
Multimodal interfaces require effective parsing and understanding of utterances whose content is distributed across multiple input modes. Johnston (1998) presents an approach in which strategies for multimodal integration are stated declaratively using a unification-based grammar that is used by a multi-dimensional chart parser to compose inputs. This approach is highly expressive and supports a broad class of interfaces, but it offers only limited potential for mutual compensation among the input modes, raises significant concerns about computational complexity, and complicates selection among alternative multimodal interpretations of the input. In this paper, we present an alternative approach in which multimodal parsing and understanding are achieved using a weighted finite-state device that takes speech and gesture streams as inputs and outputs their joint interpretation. This approach is significantly more efficient, enables tight coupling of multimodal understanding with speech recognition, and provides a general probabilistic framework for multimodal ambiguity resolution.
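To make the finite-state formulation concrete, the following is a minimal, self-contained Python sketch of the general idea: a single weighted machine over (gesture, speech, meaning) triples is matched against a gesture stream and a speech stream, and the lowest-cost accepting path yields the joint interpretation. The toy grammar, the gesture symbols Gp/Go, and the arc weights are illustrative assumptions for this sketch, not the authors' actual grammar or implementation.

```python
# Sketch of finite-state multimodal integration: a weighted machine over
# (gesture, speech, meaning) triples is intersected with a gesture stream and a
# speech stream; the cheapest accepting path gives the joint interpretation.
# The grammar, symbols, and weights below are illustrative assumptions only.

from heapq import heappush, heappop

EPS = ""  # epsilon: the arc consumes nothing on that tape

# Each arc: (src state, gesture symbol, speech word, meaning output, cost, dst state)
# Toy grammar: "email/phone" + "this" + deictic gesture -> ACTION(entity)
ARCS = [
    (0, EPS, "email", "EMAIL(", 0.0, 1),
    (0, EPS, "phone", "PHONE(", 0.2, 1),
    (1, EPS, "this",  EPS,      0.0, 2),
    (2, "Gp", EPS,    "person", 0.0, 3),   # Gp: pen gesture on a person
    (2, "Go", EPS,    "org",    0.5, 3),   # Go: pen gesture on an organization
    (3, EPS, EPS,     ")",      0.0, 4),
]
START, FINAL = 0, 4


def integrate(gestures, speech):
    """Return (cost, meaning) of the cheapest joint parse, or None if none exists."""
    # Search state: (accumulated cost, grammar state, gesture index, speech index, meaning so far)
    heap = [(0.0, START, 0, 0, "")]
    seen = set()
    while heap:
        cost, q, gi, si, meaning = heappop(heap)
        if q == FINAL and gi == len(gestures) and si == len(speech):
            return cost, meaning          # both input streams fully consumed
        key = (q, gi, si)
        if key in seen:
            continue
        seen.add(key)
        for src, g, s, m, w, dst in ARCS:
            if src != q:
                continue
            ngi, nsi = gi, si
            if g != EPS:                  # arc must match the next gesture symbol
                if gi >= len(gestures) or gestures[gi] != g:
                    continue
                ngi = gi + 1
            if s != EPS:                  # arc must match the next speech word
                if si >= len(speech) or speech[si] != s:
                    continue
                nsi = si + 1
            heappush(heap, (cost + w, dst, ngi, nsi, meaning + m))
    return None


if __name__ == "__main__":
    # Speech "email this" plus a pen gesture on a person -> (0.0, 'EMAIL(person)')
    print(integrate(["Gp"], ["email", "this"]))
```

Because the arc costs accumulate along the joint path, weighted hypotheses from the speech and gesture recognizers can be folded into the same search, which is one way the finite-state framing supports probabilistic ambiguity resolution across modes.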