Multi-modal features for real-time detection of human-robot interaction categories

  • Authors:
  • Ian R. Fasel;Masahiro Shiomi;Philippe-Emmanuel Chadutaud;Takayuki Kanda;Norihiro Hagita;Hiroshi Ishiguro

  • Affiliations:
  • University of Arizona, Tucson, AZ, USA;Advanced Telecommunications Research Institute International (ATR), Kyoto, Japan;Advanced Telecommunications Research Institute International (ATR), Kyoto, Japan;Advanced Telecommunications Research Institute International (ATR), Kyoto, Japan;Advanced Telecommunications Research Institute International (ATR), Kyoto, Japan;Advanced Telecommunications Research Institute International (ATR), Kyoto, Japan

  • Venue:
  • Proceedings of the 2009 International Conference on Multimodal Interfaces
  • Year:
  • 2009

Abstract

Social interactions unfold over time, at multiple time scales, and can be observed through multiple sensory modalities. In this paper, we propose a machine learning framework for selecting and combining low-level sensory features from different modalities to produce high-level characterizations of human-robot social interactions in real-time. We introduce a novel set of fast, multi-modal, spatio-temporal features for audio sensors, touch sensors, floor sensors, laser range sensors, and the time-series history of the robot's own behaviors. A subset of these features is automatically selected and combined using GentleBoost, an ensemble machine learning technique, allowing the robot to make an estimate of the current interaction category every 100 milliseconds. This information can then be used either by the robot to make decisions autonomously or by a remote human operator who can modify the robot's behavior manually (i.e., semi-autonomous operation). We demonstrate the technique on an information-kiosk robot deployed in a busy train station, focusing on the problem of detecting interaction breakdowns (i.e., failure of the robot to engage in a good interaction). We show that despite the varied and unscripted nature of human-robot interactions in the real-world train-station setting, the robot can achieve highly accurate predictions of interaction breakdowns at the same instant human observers become aware of them.
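
To make the feature-selection step concrete, here is a minimal sketch of GentleBoost (Friedman, Hastie, and Tibshirani, 2000) with single-feature regression stumps as weak learners. Because each boosting round fits a stump on one feature dimension, the ensemble implicitly selects a subset of features, as the abstract describes. This is an illustration under assumed details, not the authors' implementation: the function and variable names (`fit_stump`, `gentleboost_fit`, `n_rounds`) are hypothetical, and the paper's actual weak learners and thresholding scheme may differ.

```python
# Sketch of GentleBoost with regression stumps for binary labels y in {-1, +1}.
# Each round fits a stump by weighted least squares, adds it to the additive
# model F, and reweights examples by exp(-y * f). Illustrative only.
import numpy as np

def fit_stump(X, y, w):
    """Best weighted least-squares stump over all features.

    The stump predicts f(x) = a * [x[j] > theta] + b; the weighted mean of y
    on each side of the threshold minimizes the weighted squared error.
    Returns (feature index j, threshold theta, a, b).
    """
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for theta in np.unique(X[:, j]):
            mask = X[:, j] > theta
            w_hi, w_lo = w[mask].sum(), w[~mask].sum()
            if w_hi == 0 or w_lo == 0:
                continue  # degenerate split, skip
            mu_hi = (w[mask] * y[mask]).sum() / w_hi
            mu_lo = (w[~mask] * y[~mask]).sum() / w_lo
            err = (w * (y - np.where(mask, mu_hi, mu_lo)) ** 2).sum()
            if err < best_err:
                best_err, best = err, (j, theta, mu_hi - mu_lo, mu_lo)
    return best

def gentleboost_fit(X, y, n_rounds=50):
    """Fit an additive model F(x) = sum_m f_m(x) by GentleBoost."""
    w = np.full(len(y), 1.0 / len(y))  # uniform initial example weights
    ensemble = []
    for _ in range(n_rounds):
        j, theta, a, b = fit_stump(X, y, w)
        f = a * (X[:, j] > theta) + b
        w *= np.exp(-y * f)   # down-weight examples the stump gets right
        w /= w.sum()          # renormalize
        ensemble.append((j, theta, a, b))
    return ensemble

def gentleboost_predict(ensemble, X):
    """Classify by the sign of the additive model F(x)."""
    F = np.zeros(len(X))
    for j, theta, a, b in ensemble:
        F += a * (X[:, j] > theta) + b
    return np.sign(F)
```

In a setting like the one described, `X` would hold one row of multi-modal spatio-temporal features per 100 ms time step, with `y = +1` marking breakdown frames; the features chosen across rounds (the `j` indices in the ensemble) constitute the automatically selected subset, and the real-valued margin `F(x)` can serve as a confidence score for a remote operator.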