Toward adaptive information fusion in multimodal systems

  • Authors:
  • Xiao Huang; Sharon Oviatt

  • Affiliations:
  • Center for Human-Computer Communication, Computer Science Department, Oregon Health and Science University, Beaverton, OR (both authors)

  • Venue:
  • MLMI'05: Proceedings of the Second International Conference on Machine Learning for Multimodal Interaction
  • Year:
  • 2005

Abstract

In recent years, a new generation of multimodal systems has emerged as a major direction within the HCI community. Multimodal interfaces and architectures are time-critical and data-intensive to develop, which poses new research challenges. The goal of the present work is to model and adapt to users' multimodal integration patterns, so that faster and more robust systems can be developed through on-line adaptation to individual users' multimodal temporal thresholds. In this paper, we summarize past user-modeling results on speech and pen multimodal integration patterns, which indicate that there are two dominant multimodal integration patterns among users that can be detected very early and remain highly consistent. The empirical results also indicate that, when interacting with a multimodal system, users intermix unimodal with multimodal commands. Based on these results, we present new machine-learning results comparing three Bayesian Belief Network models for on-line system adaptation to users' integration patterns. This work utilized data from ten adults who provided approximately 1,000 commands while interacting with a map-based multimodal system. Initial experimental results with our learning models indicated that 85% of users' natural mixed input could be correctly classified as either unimodal or multimodal, and 82% of users' multimodal input could be correctly classified as either sequentially or simultaneously integrated. The long-term goal of this research is to develop new strategies for combining empirical user modeling with machine-learning techniques to bootstrap faster, more generalizable, and more reliable information fusion in new types of multimodal systems.
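
To make the two classification tasks described above concrete, the following is a minimal illustrative sketch in Python, not the authors' implementation: it decides unimodal vs. multimodal from which input modes are present in a command, and sequential vs. simultaneous integration with a one-feature naive Bayes model over the speech-pen onset lag. The Command features, lag_models parameters, and priors are hypothetical stand-ins for the per-user temporal statistics that the paper's Bayesian Belief Networks would learn and adapt on-line.

    from dataclasses import dataclass
    from typing import Dict, Optional, Tuple
    import math

    @dataclass
    class Command:
        """Hypothetical temporal features of one user command (names assumed)."""
        speech_onset: Optional[float]  # seconds from command start; None if no speech
        pen_onset: Optional[float]     # seconds from command start; None if no pen input

    def classify_modality(cmd: Command) -> str:
        """Unimodal vs. multimodal: did the user employ one input mode or both?"""
        if cmd.speech_onset is None or cmd.pen_onset is None:
            return "unimodal"
        return "multimodal"

    def gaussian_loglik(x: float, mean: float, std: float) -> float:
        """Log-likelihood of x under a Gaussian with the given mean and std."""
        return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

    def classify_integration(cmd: Command,
                             lag_models: Dict[str, Tuple[float, float]],
                             priors: Dict[str, float]) -> str:
        """Sequential vs. simultaneous integration for a multimodal command,
        scored by a one-feature naive Bayes model over the inter-onset lag
        (a deliberately simplified stand-in for the paper's Bayesian Belief Networks)."""
        lag = abs(cmd.speech_onset - cmd.pen_onset)
        scores = {label: math.log(priors[label]) + gaussian_loglik(lag, mean, std)
                  for label, (mean, std) in lag_models.items()}
        return max(scores, key=scores.get)

    # Per-user parameters; in an adaptive system these would be updated on-line
    # from that user's command history (all values below are made up).
    lag_models = {"simultaneous": (0.2, 0.3), "sequential": (1.5, 0.8)}
    priors = {"simultaneous": 0.7, "sequential": 0.3}

    cmd = Command(speech_onset=0.0, pen_onset=1.4)
    if classify_modality(cmd) == "multimodal":
        print(classify_integration(cmd, lag_models, priors))  # -> "sequential"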