Human-robot collaborative tutoring using multiparty multimodal spoken dialogue

  • Authors:
  • Samer Al Moubayed;Jonas Beskow;Bajibabu Bollepalli;Joakim Gustafson;Ahmed Hussen-Abdelaziz;Martin Johansson;Maria Koutsombogera;José David Lopes;Jekaterina Novikova;Catharine Oertel;Gabriel Skantze;Kalin Stefanov;Gül Varol

  • Affiliations:
  • KTH, Stockholm, Sweden;KTH, Stockholm, Sweden;KTH, Stockholm, Sweden;KTH, Stockholm, Sweden;Ruhr-Universität Bochum, Bochum, Germany;KTH, Stockholm, Sweden;Institute for Language and Speech Processing, Athens, Greece;INESC-ID, Lisbon, Portugal;University of Bath, Bath, United Kingdom;KTH, Stockholm, Sweden;KTH, Stockholm, Sweden;KTH, Stockholm, Sweden;Bogazici University, Istanbul, Turkey

  • Venue:
  • Proceedings of the 2014 ACM/IEEE International Conference on Human-Robot Interaction
  • Year:
  • 2014

Abstract

In this paper, we describe a project that explores a novel experimental setup for building a spoken, multimodally rich, and human-like multiparty tutoring robot. A human-robot interaction setup is designed, and a human-human dialogue corpus is collected. The corpus targets the development of a dialogue system platform for studying verbal and nonverbal tutoring strategies in multiparty spoken interactions with robots capable of spoken dialogue. The dialogue task centers on two participants engaged in a conversation aimed at solving a card-ordering game. Alongside the participants sits a tutor (a robot) that helps them perform the task and organizes and balances their interaction. Multimodal signals, captured and auto-synchronized by several audio-visual capture technologies (a microphone array, Kinects, and video cameras), were coupled with manual annotations. These are used to build a situated model of the interaction based on the participants' personalities, their states of attention, their conversational engagement, and their verbal dominance, and on how these correlate with the verbal and visual feedback, turn management, and conversation-regulating actions generated by the tutor. Driven by the analysis of the corpus, we also present the detailed design methodology for an affective, multimodally rich dialogue system that lets the robot incrementally measure the attention state and dominance of each participant, allowing the robot head Furhat to maintain a well-coordinated, balanced, and engaging conversation that attempts to maximize agreement and each participant's contribution to solving the task. This project takes the first steps toward exploring the potential of multimodal dialogue systems for building interactive robots that can serve in educational, team-building, and collaborative task-solving applications.
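
The abstract's central technical idea is an incremental situated model: per-participant attention and dominance estimates that are updated continuously and used to regulate the conversation. The Python sketch below illustrates that loop under stated assumptions; it is not the authors' implementation. The class and function names, the exponential smoothing, and the 0.15 imbalance threshold are all invented for the example.

    """A minimal sketch, not the paper's system: a tutor robot incrementally
    tracks floor-holding and task attention per participant from frame-level
    voice-activity and gaze signals, then picks a regulating action."""

    from dataclasses import dataclass


    @dataclass
    class ParticipantState:
        speaking: float = 0.0   # smoothed voice-activity estimate in [0, 1]
        attending: float = 0.0  # smoothed "gaze on task" estimate in [0, 1]

        def update(self, is_speaking: bool, gaze_on_task: bool,
                   alpha: float = 0.05) -> None:
            # Exponential moving average keeps the estimate incremental:
            # each new frame nudges the running value; no history is stored.
            self.speaking += alpha * (float(is_speaking) - self.speaking)
            self.attending += alpha * (float(gaze_on_task) - self.attending)


    def dominance(a: ParticipantState, b: ParticipantState) -> float:
        """Share of recent speech held by A, in [0, 1]; 0.5 means balanced."""
        total = a.speaking + b.speaking
        return 0.5 if total == 0 else a.speaking / total


    def tutor_action(a: ParticipantState, b: ParticipantState,
                     imbalance: float = 0.15) -> str:
        """Choose a conversation-regulating action for the tutor (e.g., Furhat)."""
        d = dominance(a, b)
        if d > 0.5 + imbalance:
            return "address_B"           # invite the quieter participant in
        if d < 0.5 - imbalance:
            return "address_A"
        if min(a.attending, b.attending) < 0.3:
            return "redirect_attention"  # e.g., gaze at / point to the cards
        return "give_feedback"           # balanced and attentive: acknowledge


    if __name__ == "__main__":
        alice, bob = ParticipantState(), ParticipantState()
        # Simulate 100 frames in which Alice holds the floor alone.
        for frame in range(100):
            alice.update(is_speaking=True, gaze_on_task=True)
            bob.update(is_speaking=False, gaze_on_task=True)
        print(dominance(alice, bob))     # well above 0.5
        print(tutor_action(alice, bob))  # -> "address_B"

In the system described in the paper, the same decision point would draw on richer cues (personality, engagement, and the corpus's manual annotations), but the structure, update the estimates every frame and then choose a regulating action, is the relevant idea.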