Implementation and evaluation of a constraint-based multimodal fusion system for speech and 3D pointing gestures

  • Authors: Hartwig Holzapfel, Kai Nickel, Rainer Stiefelhagen
  • Affiliation: Universität Karlsruhe (TH), Germany
  • Venue: Proceedings of the 6th International Conference on Multimodal Interfaces (ICMI '04)
  • Year: 2004

Abstract

This paper presents an architecture for the fusion of multimodal input streams for natural interaction with a humanoid robot, along with results from a user study of our system. The fusion architecture consists of an application-independent parser of input events and application-specific rules. In the user study, participants interacted with a robot in a kitchen scenario using speech and gesture input. We observed that our fusion approach is very tolerant of falsely detected pointing gestures, because speech serves as the main modality and pointing gestures are used mainly to disambiguate object references. We also report on the temporal correlation of speech and gesture events as observed in the user study.
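
The paper itself contains no code; the following is a minimal sketch of the disambiguation idea the abstract describes: speech is the primary modality, and a pointing gesture is consulted only when the spoken reference is ambiguous, and only if it falls within a temporal window of the utterance. All type names, the window size, and the confidence field are hypothetical illustrations, not the authors' implementation.

```python
from dataclasses import dataclass

@dataclass
class SpeechEvent:
    timestamp: float        # seconds
    candidates: list[str]   # objects matching the spoken reference

@dataclass
class PointingEvent:
    timestamp: float
    target: str             # object the detected gesture points at
    confidence: float

# Hypothetical fusion window (seconds). The paper measures the
# speech/gesture correlation empirically; this value is illustrative.
FUSION_WINDOW_S = 2.0

def resolve_reference(speech: SpeechEvent,
                      gestures: list[PointingEvent]) -> str | None:
    """Resolve a spoken object reference, using pointing only for
    disambiguation. Because speech dominates, a falsely detected
    gesture cannot override an unambiguous utterance."""
    if len(speech.candidates) == 1:
        # Speech alone is unambiguous: gestures are ignored entirely.
        return speech.candidates[0]

    # Speech is ambiguous: look for a temporally correlated gesture
    # whose target is among the spoken candidates.
    nearby = [g for g in gestures
              if abs(g.timestamp - speech.timestamp) <= FUSION_WINDOW_S
              and g.target in speech.candidates]
    if nearby:
        return max(nearby, key=lambda g: g.confidence).target
    return None  # unresolved; the dialogue could ask a clarifying question

# Example: "the cup" matches two objects; a nearby gesture picks one.
speech = SpeechEvent(timestamp=10.0, candidates=["red_cup", "blue_cup"])
gestures = [PointingEvent(timestamp=10.8, target="blue_cup", confidence=0.7),
            PointingEvent(timestamp=25.0, target="red_cup", confidence=0.9)]
print(resolve_reference(speech, gestures))  # -> "blue_cup"
```

Under this scheme a spurious pointing detection (like the second event above, outside the window) simply never enters the fusion step, which matches the tolerance the abstract reports.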