Finite-state multimodal parsing and understanding

  • Authors:
  • Michael Johnston; Srinivas Bangalore

  • Affiliations:
  • AT&T Labs - Research, Shannon Laboratory, Florham Park, NJ (both authors)

  • Venue:
  • COLING '00: Proceedings of the 18th Conference on Computational Linguistics - Volume 1
  • Year:
  • 2000

Abstract

Multimodal interfaces require effective parsing and understanding of utterances whose content is distributed across multiple input modes. Johnston (1998) presents an approach in which strategies for multimodal integration are stated declaratively using a unification-based grammar that is used by a multi-dimensional chart parser to compose inputs. This approach is highly expressive and supports a broad class of interfaces, but it offers only limited potential for mutual compensation among the input modes, raises significant concerns about computational complexity, and complicates selection among alternative multimodal interpretations of the input. In this paper, we present an alternative approach in which multimodal parsing and understanding are achieved using a weighted finite-state device which takes speech and gesture streams as inputs and outputs their joint interpretation. This approach is significantly more efficient, enables tight coupling of multimodal understanding with speech recognition, and provides a general probabilistic framework for multimodal ambiguity resolution.
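
To make the core idea concrete, below is a minimal illustrative sketch, not the authors' implementation: a toy weighted finite-state device whose transitions consume a speech symbol and a gesture symbol together and emit a meaning symbol. Weights are treated as negative log probabilities (tropical semiring), so the best joint interpretation is the lowest-cost accepting path. All state numbers, symbols (e.g. "point_person"), and the helper best_interpretation are hypothetical names invented for this example.

```python
# Toy weighted finite-state device over paired speech/gesture input symbols.
# Each transition: (speech_in, gesture_in, meaning_out, cost, next_state).
# "eps" means the arc consumes nothing from that input stream.
import heapq
from typing import Dict, List, Optional, Tuple

FST = Dict[int, List[Tuple[str, str, str, float, int]]]

toy_fst: FST = {
    0: [("phone", "eps", "phone(", 0.1, 1),
        ("email", "eps", "email(", 0.3, 1)],
    1: [("this", "point_person", "person_id", 0.2, 2),
        ("this", "point_org", "org_id", 0.5, 2)],
    2: [("eps", "eps", ")", 0.0, 3)],
}
FINAL = {3}

def best_interpretation(fst: FST, speech: List[str],
                        gesture: List[str]) -> Optional[Tuple[float, str]]:
    """Dijkstra-style search for the cheapest path that consumes both
    input streams and ends in a final state; returns (cost, meaning)."""
    # Search state: (cost, fst_state, speech_pos, gesture_pos, meaning_so_far)
    heap: list = [(0.0, 0, 0, 0, [])]
    seen = set()
    while heap:
        cost, q, i, j, meaning = heapq.heappop(heap)
        if q in FINAL and i == len(speech) and j == len(gesture):
            return cost, " ".join(meaning)
        key = (q, i, j)
        if key in seen:
            continue
        seen.add(key)
        for s_in, g_in, m_out, w, nxt in fst.get(q, []):
            ni, nj = i, j
            if s_in != "eps":
                if i < len(speech) and speech[i] == s_in:
                    ni = i + 1
                else:
                    continue
            if g_in != "eps":
                if j < len(gesture) and gesture[j] == g_in:
                    nj = j + 1
                else:
                    continue
            heapq.heappush(heap, (cost + w, nxt, ni, nj, meaning + [m_out]))
    return None

print(best_interpretation(toy_fst, ["phone", "this"], ["point_person"]))
# -> (0.3, 'phone( person_id )')
```

Because a single weighted machine scores both streams jointly, a noisy hypothesis in one mode can be outweighed by evidence from the other, which is the mutual compensation and probabilistic ambiguity resolution the abstract emphasizes.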