The logic of typed feature structures
Integration and synchronization of input modes during multimodal human-computer interaction
Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems
Communicative Rhythm in Gesture and Speech
GW '99 Proceedings of the International Gesture Workshop on Gesture-Based Communication in Human-Computer Interaction
“Put-that-there”: Voice and gesture at the graphics interface
SIGGRAPH '80 Proceedings of the 7th annual conference on Computer graphics and interactive techniques
The Karlsruhe-Verbmobil Speech Recognition Engine
ICASSP '97 Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '97) - Volume 1
Multimodal Interaction During Multiparty Dialogues: Initial Results
ICMI '02 Proceedings of the 4th IEEE International Conference on Multimodal Interfaces
A Map-Based System Using Speech and 3D Gestures for Pervasive Computing
ICMI '02 Proceedings of the 4th IEEE International Conference on Multimodal Interfaces
Prosody Based Co-analysis for Continuous Recognition of Coverbal Gestures
ICMI '02 Proceedings of the 4th IEEE International Conference on Multimodal Interfaces
SmartKom: adaptive and flexible multimodal access to multiple applications
Proceedings of the 5th International Conference on Multimodal Interfaces
Pointing gesture recognition based on 3D-tracking of face, hands and head orientation
Proceedings of the 5th International Conference on Multimodal Interfaces
Where is "it"? Event Synchronization in Gaze-Speech Input Systems
Proceedings of the 5th International Conference on Multimodal Interfaces
Unification-based multimodal integration
ACL '97 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Unification-based multimodal parsing
COLING '98 Proceedings of the 17th International Conference on Computational Linguistics - Volume 1
Rapid prototyping for spoken dialogue systems
COLING '02 Proceedings of the 19th International Conference on Computational Linguistics - Volume 1
MATCH: an architecture for multimodal dialogue systems
ACL '02 Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics
Multimodal integration - a statistical view
IEEE Transactions on Multimedia
Inferring body pose using speech content
ICMI '05 Proceedings of the 7th International Conference on Multimodal Interfaces
Put a grammar here: bi-directional parsing in multimodal interaction
CHI '06 Extended Abstracts on Human Factors in Computing Systems
Fusion of children's speech and 2D gestures when conversing with 3D characters
Signal Processing - Special section: Multimodal human-computer interfaces
Visual recognition of pointing gestures for human-robot interaction
Image and Vision Computing
"Move the couch where?": developing an augmented reality multimodal interface
ISMAR '06 Proceedings of the 5th IEEE and ACM International Symposium on Mixed and Augmented Reality
Clavius: bi-directional parsing for generic multimodal interaction
COLING-ACL '06 Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
Towards a Multidimensional Approach for the Evaluation of Multimodal Application User Interfaces
Proceedings of the 13th International Conference on Human-Computer Interaction. Part II: Novel Interaction Methods and Techniques
Fusion engines for multimodal input: a survey
Proceedings of the 2009 International Conference on Multimodal Interfaces
Benchmarking fusion engines of multimodal interactive systems
Proceedings of the 2009 International Conference on Multimodal Interfaces
Concept-based evidential reasoning for multimodal fusion in human-computer interaction
Applied Soft Computing
Multimodal interaction with an autonomous forklift
Proceedings of the 5th ACM/IEEE International Conference on Human-Robot Interaction
A robot learns to know people: first contacts of a robot
KI '06 Proceedings of the 29th Annual German Conference on Artificial Intelligence
An input-parsing algorithm supporting integration of deictic gesture in natural language interface
HCI '07 Proceedings of the 12th International Conference on Human-Computer Interaction: Intelligent Multimodal Interaction Environments
An evaluation of an augmented reality multimodal interface using speech and paddle gestures
ICAT '06 Proceedings of the 16th International Conference on Advances in Artificial Reality and Tele-Existence
Using intelligent natural user interfaces to support sales conversations
Proceedings of the 2012 ACM International Conference on Intelligent User Interfaces
Speak up your mind: using speech to capture innovative ideas on interactive surfaces
Proceedings of the 10th Brazilian Symposium on Human Factors in Computing Systems and the 5th Latin American Conference on Human-Computer Interaction
Usability evaluation of MUI: a case study
Proceedings of the 10th Brazilian Symposium on Human Factors in Computing Systems and the 5th Latin American Conference on Human-Computer Interaction
Modeling ontology for multimodal interaction in ubiquitous computing systems
Proceedings of the 2012 ACM Conference on Ubiquitous Computing
Using the transferable belief model for multimodal input fusion in companion systems
MPRSS '12 Proceedings of the First International Conference on Multimodal Pattern Recognition of Social Signals in Human-Computer Interaction
Free-hand pointing for identification and interaction with distant objects
Proceedings of the 5th International Conference on Automotive User Interfaces and Interactive Vehicular Applications
This paper presents an architecture for fusing multimodal input streams for natural interaction with a humanoid robot, together with results from a user study of our system. The fusion architecture consists of an application-independent parser of input events and a set of application-specific rules. In the user study, participants interacted with a robot in a kitchen scenario using speech and gesture input. We observed that our fusion approach is highly tolerant of falsely detected pointing gestures, because speech serves as the main modality and pointing gestures are used mainly to disambiguate object references. We also report on the temporal correlation of speech and gesture events as observed in the study.
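The speech-driven, rule-based fusion idea described in the abstract can be illustrated with a minimal sketch. This is an assumed illustration, not the paper's implementation: the event types, the deictic-resolution rule, and the 1.5-second correlation window are all hypothetical values chosen for the example. The key property it demonstrates is the tolerance mentioned above: a pointing gesture is only consulted when the utterance contains a deictic reference, so spurious gesture detections are simply ignored.

```python
from dataclasses import dataclass

@dataclass
class SpeechEvent:
    text: str       # recognized utterance, e.g. "put that in the sink"
    deictic: bool   # True if the utterance contains "that"/"there"
    t: float        # timestamp in seconds

@dataclass
class PointingEvent:
    target: str     # object the gesture recognizer believes was pointed at
    t: float        # timestamp in seconds

# Assumed maximum time offset between correlated speech and gesture events.
WINDOW = 1.5

def fuse(speech: SpeechEvent, gestures: list[PointingEvent]) -> str:
    """Application-specific rule: resolve a deictic reference using the
    temporally closest pointing gesture inside the correlation window."""
    if not speech.deictic:
        # No deictic reference: gestures (including false detections)
        # are ignored entirely, which makes the approach robust.
        return speech.text
    candidates = [g for g in gestures if abs(g.t - speech.t) <= WINDOW]
    if not candidates:
        return speech.text + " [unresolved reference]"
    best = min(candidates, key=lambda g: abs(g.t - speech.t))
    return speech.text.replace("that", best.target)

# Example: the spurious gesture at t=3.0 falls outside the window and
# is ignored; the gesture at t=10.4 resolves "that".
speech = SpeechEvent("put that in the sink", deictic=True, t=10.0)
gestures = [PointingEvent("spoon", t=3.0), PointingEvent("cup", t=10.4)]
print(fuse(speech, gestures))  # -> "put cup in the sink"
```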