Predicting evidence of understanding by monitoring user's task manipulation in multimodal conversations

  • Authors:
  • Yukiko I. Nakano;Yoshiko Arimoto;Kazuyoshi Murata;Yasuhiro Asa;Mika Enomoto;Hirohiko Sagawa

  • Affiliations:
  • Tokyo University of Agriculture and Technology, Koganei-shi, Tokyo, Japan;Tokyo University of Technology, Hachioji, Tokyo, Japan;Tokyo University of Agriculture and Technology, Koganei-shi, Tokyo, Japan;Hitachi, Ltd., Higashi-Koigakubo, Kokubunji-shi, Tokyo, Japan;Tokyo University of Agriculture and Technology, Koganei-shi, Tokyo, Japan;Hitachi, Ltd., Higashi-Koigakubo, Kokubunji-shi, Tokyo, Japan

  • Venue:
  • ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
  • Year:
  • 2007

Abstract

The aim of this paper is to develop animated agents that can control multimodal instruction dialogues by monitoring the user's behaviors. First, this paper reports on our Wizard-of-Oz experiments; then, using the collected corpus, it proposes a probabilistic model of fine-grained timing dependencies among multimodal communication behaviors: speech, gestures, and mouse manipulations. A preliminary evaluation revealed that our model can predict an instructor's grounding judgment and a listener's successful mouse manipulation quite accurately, suggesting that the model is useful for estimating the user's understanding and can be applied to determining the agent's next action.
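The paper itself presents no code; purely as an illustration of the prediction task it describes, the sketch below trains a simple classifier to predict a binary grounding judgment from multimodal timing features. The feature names, the synthetic data, and the choice of logistic regression are all assumptions made here for illustration, not the authors' probabilistic timing model.

```python
# Illustrative sketch only (not the authors' model): predict whether an
# instructor would judge an instruction step as grounded, from hypothetical
# per-utterance multimodal timing features.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical features per instruction step:
#   [pause_after_instruction_sec, mouse_move_latency_sec, gesture_overlap_ratio]
X = np.array([
    [0.2, 0.5, 0.8],  # prompt manipulation, strong speech/gesture overlap
    [1.5, 2.0, 0.1],  # long delay, little overlap
    [0.3, 0.7, 0.6],
    [1.2, 1.8, 0.2],
])
y = np.array([1, 0, 1, 0])  # synthetic labels: 1 = judged grounded

model = LogisticRegression().fit(X, y)

# Probability that a new step is grounded, given its timing features.
print(model.predict_proba([[0.4, 0.6, 0.7]]))
```

An agent built on such a predictor could, for example, repeat or elaborate the instruction when the predicted grounding probability falls below a threshold; the paper's contribution is the corpus-derived timing model behind that judgment, not this particular classifier.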