Temporal binding of multimodal controls for dynamic map displays: a systems approach

Authors:
Ellen C. Haas; Krishna S. Pillalamarri; Chris C. Stachowiak; Gardner McCullough
Affiliations:
Human Research and Engineering Directorate, Aberdeen Proving Ground, MD, USA; Human Research and Engineering Directorate, Aberdeen Proving Ground, MD, USA; Human Research and Engineering Directorate, Aberdeen Proving Ground, MD, USA; University of Maryland Baltimore County, Baltimore, MD, USA
Venue:
ICMI '11 Proceedings of the 13th international conference on multimodal interfaces
Year:
2011

Abstract

Dynamic map displays are visual interfaces that show the spatial positions of objects of interest (e.g., people, robots, vehicles) and can be updated by user commands as well as by changes in the world, often in real time. Multimodal (speech and touch) controls were designed for a U.S. Army Research Laboratory dynamic map display to allow users to exercise supervisory control of a simulated robotic swarm. This study characterized the effects of user performance (input difficulty, modality preference, and response to different levels of workload) on multimodal intercommand time (i.e., temporal binding), and explored how these effects might relate to the system's ability to bind, or fuse, user multimodal inputs into a unitary response. User performance was tested in a laboratory study with 6 male and 6 female volunteers (mean age 26 years). Results showed that 64% of all participants used speech commands first 100% of the time, while the remaining participants used touch commands first 100% of the time. Temporal binding between touch and voice commands was significantly shorter for touch-first than for speech-first commands, regardless of workload level. For both speech and touch commands, temporal binding was significantly shorter for roads and swarm edges than for intersections. These results indicate that all of these factors can bear significantly on a system's ability to bind multimodal inputs into a unitary response. Suggestions for future research are described.