An exploration of gesture-speech multimodal patterns for touch interfaces

  • Authors:
  • Prasenjit Dey; Sriganesh Madhvanath; Amit Ranjan; Suvodeep Das

  • Affiliations:
  • Hewlett-Packard Labs, Salarpuria Arena, Adugodi, Bangalore, India (Dey, Madhvanath, Ranjan); KHB Colony, Kormangla, Bangalore, India (Das)

  • Venue:
  • Proceedings of the 3rd International Conference on Human Computer Interaction
  • Year:
  • 2011

Abstract

Multimodal interfaces that integrate multiple input modalities such as speech, gestures, and gaze have shown considerable promise in terms of higher task efficiency, lower error rates, and higher user satisfaction. However, the adoption of such interfaces in real-world systems has been slow, for reasons that may be both technological (e.g., the accuracy of recognition engines, fusion engines, and authoring) and usability-related. In this paper, we explore a few patterns of "command and control" style multimodal interaction (MMI) using touch gestures and short speech utterances. We then describe a multimodal interface for a photo browsing application and a user study conducted to understand some of the usability issues with such interfaces. Specifically, we study walk-up use of multimodal commands for photo manipulation and compare it with unimodal multi-touch interaction. We observe that there is a learning period after which users become more comfortable with the multimodal commands, and average task completion times decrease significantly. We also analyze the temporal integration patterns of speech and touch gestures. We see this as the first of many studies leading to a more detailed understanding of user preferences and performance with MMI, which can inform the judicious use of MMI in designing interactions for future interfaces.
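
To make the notion of temporal integration concrete, the sketch below pairs a speech command with a touch gesture when the two events fall within a fixed time window. This is a minimal illustration in Python, not the fusion engine or study setup described in the paper; the event types, the fuse() helper, and the 1.5-second window are assumptions made for the example only.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed temporal integration window (seconds); the study measures such
# patterns empirically, so this value is only for illustration.
FUSION_WINDOW_S = 1.5

@dataclass
class TouchGesture:
    kind: str         # e.g. "point", "lasso", "two-finger-rotate"
    target: str       # object the gesture selects, e.g. a photo id
    timestamp: float  # seconds since session start

@dataclass
class SpeechUtterance:
    command: str      # e.g. "rotate", "delete", "email"
    timestamp: float  # seconds since session start

def fuse(gesture: TouchGesture, speech: SpeechUtterance) -> Optional[dict]:
    """Pair a touch gesture with a speech command if the two events fall
    within the fusion window, in either temporal order."""
    if abs(gesture.timestamp - speech.timestamp) <= FUSION_WINDOW_S:
        return {"action": speech.command, "target": gesture.target}
    return None  # otherwise treat the events as unrelated (unimodal) input

# Example: the user taps a photo and says "rotate" 0.4 s later.
print(fuse(TouchGesture("point", "photo_17", 10.0),
           SpeechUtterance("rotate", 10.4)))
# -> {'action': 'rotate', 'target': 'photo_17'}
```

A symmetric window such as this accepts both gesture-before-speech and speech-before-gesture orderings; a practical fusion engine would additionally handle unimodal fallbacks and recognizer confidence scores.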