Toward a theory of organized multimodal integration patterns during human-computer interaction

  • Authors:
  • Sharon Oviatt; Rachel Coulston; Stefanie Tomko; Benfang Xiao; Rebecca Lunsford; Matt Wesson; Lesley Carmichael

  • Affiliations:
  • Oregon Health & Science University, Beaverton, OR; Oregon Health & Science University, Beaverton, OR; Carnegie Mellon University, Pittsburgh, PA; Oregon Health & Science University, Beaverton, OR; Oregon Health & Science University, Beaverton, OR; Oregon Health & Science University, Beaverton, OR; University of Washington, Seattle, WA

  • Venue:
  • Proceedings of the 5th International Conference on Multimodal Interfaces
  • Year:
  • 2003

Abstract

As a new generation of multimodal systems begins to emerge, one dominant theme will be the integration and synchronization requirements for combining modalities into robust whole systems. In the present research, quantitative modeling is presented on the organization of users' speech and pen multimodal integration patterns. In particular, the potential malleability of users' multimodal integration patterns is explored, as well as variation in these patterns during system error handling and tasks varying in difficulty. Using a new dual-wizard simulation method, data was collected from twelve adults as they interacted with a map-based task using multimodal speech and pen input. Analyses based on over 1600 multimodal constructions revealed that users' dominant multimodal integration pattern was resistant to change, even when strong selective reinforcement was delivered to encourage switching from a sequential to simultaneous integration pattern, or vice versa. Instead, both sequential and simultaneous integrators showed evidence of entrenching further in their dominant integration patterns (i.e., increasing either their inter-modal lag or signal overlap) over the course of an interactive session, during system error handling, and when completing increasingly difficult tasks. In fact, during error handling these changes in the co-timing of multimodal signals became the main feature of hyper-clear multimodal language, with elongation of individual signals either attenuated or absent. Whereas Behavioral/Structuralist theory cannot account for these data, it is argued that Gestalt theory provides a valuable framework and insights into multimodal interaction. Implications of these findings are discussed for the development of a coherent theory of multimodal integration during human-computer interaction, and for the design of a new class of adaptive multimodal interfaces.