Contextual recognition of head gestures

  • Authors:
  • Louis-Philippe Morency;Candace Sidner;Christopher Lee;Trevor Darrell

  • Affiliations:
  • Massachusetts Institute of Technology, Cambridge, MA;Mitsubishi Electric Research Laboratories, Cambridge, MA;Mitsubishi Electric Research Laboratories, Cambridge, MA;Massachusetts Institute of Technology, Cambridge, MA

  • Venue:
  • ICMI '05 Proceedings of the 7th international conference on Multimodal interfaces
  • Year:
  • 2005

Abstract

Head pose and gesture offer several key conversational grounding cues and are used extensively in face-to-face interaction among people. We investigate how dialog context from an embodied conversational agent (ECA) can improve visual recognition of user gestures. We present a recognition framework that (1) extracts contextual features from an ECA's dialog manager, (2) computes a prediction of head nods and head shakes, and (3) integrates the contextual predictions with the visual observations of a vision-based head gesture recognizer. We found a subset of lexical, punctuation, and timing features that are easily available in most ECA architectures and can be used to learn how to predict user feedback. Using a discriminative approach to contextual prediction and multi-modal integration, we were able to improve the performance of head gesture detection even when the topic of the test set was significantly different from that of the training set.
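
The three steps of the framework described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `AgentUtterance` record, the three features, the logistic predictor, the weight values, and the weighted-sum fusion are all assumptions chosen for illustration. In a real system the weights would be learned from labeled interactions, and the abstract does not say which discriminative model is used.

```python
import math
from dataclasses import dataclass


@dataclass
class AgentUtterance:
    """Hypothetical record of what a dialog manager exposes for one agent turn."""
    text: str
    end_time: float  # seconds; when the agent finishes speaking


def contextual_features(utt, now):
    """Step 1: toy lexical, punctuation, and timing features (illustrative only)."""
    words = utt.text.lower().rstrip("?.!").split()
    return {
        "is_question": 1.0 if utt.text.strip().endswith("?") else 0.0,
        "starts_with_do": 1.0 if words[:1] == ["do"] else 0.0,
        "near_utterance_end": 1.0 if abs(now - utt.end_time) <= 1.0 else 0.0,
    }


# Hypothetical hand-set weights; a trained model would replace these.
NOD_WEIGHTS = {"is_question": 1.5, "starts_with_do": 1.0, "near_utterance_end": 2.0}
NOD_BIAS = -3.0


def predict_nod(features):
    """Step 2: contextual likelihood of a head nod (logistic model, assumed)."""
    z = NOD_BIAS + sum(NOD_WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def fuse(vision_score, context_score, w_vision=0.6, w_context=0.4, threshold=0.5):
    """Step 3: combine the vision recognizer's score with the contextual prediction.

    A fixed weighted sum is an assumption made for this sketch.
    """
    combined = w_vision * vision_score + w_context * context_score
    return combined >= threshold, combined


if __name__ == "__main__":
    ambiguous_vision_score = 0.45  # vision alone falls below the threshold

    question = AgentUtterance("Do you want to continue?", end_time=10.0)
    statement = AgentUtterance("This is the kitchen.", end_time=10.0)

    for utt, now in ((question, 10.5), (statement, 20.0)):
        context = predict_nod(contextual_features(utt, now))
        detected, score = fuse(ambiguous_vision_score, context)
        print(f"{utt.text!r}: context={context:.2f} combined={score:.2f} nod={detected}")
```

With the same ambiguous visual score, the sketch reports a nod just after a yes/no question but not long after a plain statement, which is the kind of disambiguation the contextual prediction is meant to provide.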