Predicting evidence of understanding by monitoring user's task manipulation in multimodal conversations

  • Authors:
  • Yukiko I. Nakano;Yoshiko Arimoto;Kazuyoshi Murata;Yasuhiro Asa;Mika Enomoto;Hirohiko Sagawa

  • Affiliations:
  • Tokyo University of Agriculture and Technology, Koganei-shi, Tokyo, Japan;Tokyo University of Technology, Hachioji, Tokyo, Japan;Tokyo University of Agriculture and Technology, Koganei-shi, Tokyo, Japan;Hitachi, Ltd., Higashi-Koigakubo, Kokubunji-shi, Tokyo, Japan;Tokyo University of Agriculture and Technology, Koganei-shi, Tokyo, Japan;Hitachi, Ltd., Higashi-Koigakubo, Kokubunji-shi, Tokyo, Japan

  • Venue:
  • ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
  • Year:
  • 2007

Abstract

The aim of this paper is to develop animated agents that can control multimodal instruction dialogues by monitoring the user's behaviors. First, this paper reports on our Wizard-of-Oz experiments; then, using the collected corpus, it proposes a probabilistic model of fine-grained timing dependencies among multimodal communication behaviors: speech, gestures, and mouse manipulations. A preliminary evaluation revealed that our model can predict an instructor's grounding judgment and a listener's successful mouse manipulation quite accurately, suggesting that the model is useful for estimating the user's understanding and can be applied to determining the agent's next action.
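The paper itself presents no code; purely as an illustration of the prediction task it describes, the sketch below trains a simple classifier to predict a binary grounding judgment from multimodal timing features. The feature names, the synthetic data, and the choice of logistic regression are all assumptions made here for illustration, not the authors' probabilistic timing model.

```python
# Illustrative sketch only (not the authors' model): predict whether an
# instructor would judge an instruction step as grounded, from hypothetical
# per-utterance multimodal timing features.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical features per instruction step:
#   [pause_after_instruction_sec, mouse_move_latency_sec, gesture_overlap_ratio]
X = np.array([
    [0.2, 0.5, 0.8],  # prompt manipulation, strong speech/gesture overlap
    [1.5, 2.0, 0.1],  # long delay, little overlap
    [0.3, 0.7, 0.6],
    [1.2, 1.8, 0.2],
])
y = np.array([1, 0, 1, 0])  # synthetic labels: 1 = judged grounded

model = LogisticRegression().fit(X, y)

# Probability that a new step is grounded, given its timing features.
print(model.predict_proba([[0.4, 0.6, 0.7]]))
```

An agent built on such a predictor could, for example, repeat or elaborate the instruction when the predicted grounding probability falls below a threshold; the paper's contribution is the corpus-derived timing model behind that judgment, not this particular classifier.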