Hands and speech in space: multimodal interaction with augmented reality interfaces

Authors:
Mark Billinghurst
Affiliations:
University of Canterbury, Christchurch, New Zealand
Venue:
Proceedings of the 15th ACM on International conference on multimodal interaction
Year:
2013

Abstract

Augmented Reality (AR) is a technology that allows virtual imagery to be seamlessly integrated into the real world. Although it was first developed in the 1960s, only recently has AR become widely available, through platforms such as the web and mobile phones. However, most AR interfaces support only very simple interaction, such as touch input on phone screens or camera tracking from real images. New depth sensing and gesture tracking technologies such as Microsoft Kinect or Leap Motion have made it easier than ever before to track hands in space. Combined with speech recognition and AR tracking and viewing software, these technologies make it possible to create interfaces that allow users to manipulate 3D graphics in space through a natural combination of speech and gesture. In this paper I will review previous research in multimodal AR interfaces and give an overview of the significant research questions that need to be addressed before speech and gesture interaction can become commonplace.