We describe an approach to 3D multimodal interaction in immersive augmented and virtual reality environments that accounts for the uncertain nature of the information sources. The resulting multimodal system fuses symbolic and statistical information from a set of 3D gesture, spoken language, and referential agents. The referential agents employ visible or invisible volumes that can be attached to 3D trackers in the environment, and which use a time-stamped history of the objects that intersect them to derive statistics for ranking potential referents. We discuss the means by which the system supports mutual disambiguation of these modalities and information sources, and show through a user study how mutual disambiguation accounts for over 45% of the successful 3D multimodal interpretations. An accompanying video demonstrates the system in action.
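The referent-ranking idea above — a volume attached to a 3D tracker keeps a time-stamped history of intersecting objects and derives statistics from it — can be sketched as follows. This is a minimal illustration, not the system's actual implementation: the class name, the two-second history window, and the linear recency weighting are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from collections import defaultdict

@dataclass
class ReferentialAgent:
    """Hypothetical sketch of a referential agent whose sensing volume
    logs intersecting objects over time and ranks potential referents."""
    window: float = 2.0  # seconds of history considered relevant (assumed)
    history: list = field(default_factory=list)  # (timestamp, object_id) pairs

    def record_intersection(self, timestamp, object_id):
        """Log an object found inside the agent's volume at `timestamp`."""
        self.history.append((timestamp, object_id))

    def rank_referents(self, now):
        """Score each object by recency-weighted dwell time in the volume."""
        scores = defaultdict(float)
        for t, obj in self.history:
            age = now - t
            if 0.0 <= age <= self.window:
                # Linear recency weighting: newer samples count more (assumption).
                scores[obj] += 1.0 - age / self.window
        total = sum(scores.values())
        if total == 0:
            return []
        # Normalize so the ranking can be fused with scores from other
        # agents (e.g. gesture and speech) during mutual disambiguation.
        return sorted(((obj, s / total) for obj, s in scores.items()),
                      key=lambda p: p[1], reverse=True)

agent = ReferentialAgent()
agent.record_intersection(0.5, "chair")
agent.record_intersection(1.0, "chair")
agent.record_intersection(1.2, "table")
ranking = agent.rank_referents(now=1.5)
print(ranking[0][0])  # "chair": more samples, still recent
```

Normalizing each agent's scores into a distribution is one plausible way to let the ranked referents be combined with the symbolic and statistical evidence from the gesture and speech agents.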