Spontaneous spoken dialogues with the Furhat human-like robot head

  • Authors: Samer Al Moubayed; Jonas Beskow; Gabriel Skantze
  • Affiliation: KTH, Stockholm, Sweden (all authors)

  • Venue: Proceedings of the 2014 ACM/IEEE International Conference on Human-Robot Interaction (HRI '14)
  • Year: 2014

Abstract

Furhat [1] is a robot head with a back-projected animated face that is realistic and human-like in anatomy. Furhat relies on a state-of-the-art facial animation architecture that allows accurate lip movements synchronized with speech, as well as the control and generation of non-verbal gestures, eye movements, and facial expressions. Furhat is built to study, implement, and validate patterns and models of situated, multi-party, multimodal human-human and human-machine communication, a line of study that demands the co-presence of the talking head in the interaction environment, something that cannot be achieved with virtual avatars displayed on flat screens [2,3]. In Furhat, the animated face is back-projected onto a translucent mask that is a printout of the animated model. The mask is mounted on a two-degrees-of-freedom (2DOF) neck that allows control of head movements. Figure 1 shows a snapshot of Furhat in interaction.

In this demonstrator we will show an advanced multimodal, multiparty spoken conversational system built around Furhat. Multimodal input is provided by speech together with rich sensing signals such as real-time multi-person face tracking and microphone tracking. The demonstrator showcases a system that can carry out social dialogue with multiple interlocutors simultaneously, using rich output signals such as eye and head coordination, lip-synchronized speech synthesis, and non-verbal facial gestures to regulate fluent and expressive multiparty conversations. The dialogue is designed with the IrisTK [4] dialogue authoring toolkit developed at KTH. The system can also act as a moderator in a quiz game, demonstrating different strategies for regulating spoken situated interactions.
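As a concrete illustration of how face tracking and microphone tracking might be fused to direct the robot's attention, here is a minimal sketch. It is not the demonstrator's code: the class, field names, and the `hold_bonus` heuristic are illustrative assumptions standing in for whatever fusion logic the real system uses.

```python
from dataclasses import dataclass

@dataclass
class Interlocutor:
    name: str            # identity from the face tracker
    azimuth_deg: float   # horizontal position of the tracked face
    speaking_prob: float # 0..1 estimate from microphone tracking

def select_addressee(people, current=None, hold_bonus=0.1):
    """Pick whom the robot should gaze at.

    Scores each tracked person by how likely they are speaking and
    gives a small bonus to the current gaze target so the head does
    not flicker between interlocutors on noisy estimates.
    """
    return max(people, key=lambda p: p.speaking_prob
               + (hold_bonus if p is current else 0.0))

people = [
    Interlocutor("left user", azimuth_deg=-30.0, speaking_prob=0.2),
    Interlocutor("right user", azimuth_deg=25.0, speaking_prob=0.7),
]
target = select_addressee(people, current=people[0])
print(f"gaze at {target.name} ({target.azimuth_deg:+.0f} deg)")
```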
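Coordinating the animated eyes with the 2DOF neck requires dividing a desired gaze direction between the two. The sketch below shows one plausible static decomposition; the actual head presumably uses smoothed, velocity-limited trajectories, and the joint limits and split rule here are assumptions, not measured parameters.

```python
def split_gaze(azimuth_deg, neck_limit_deg=40.0, eye_limit_deg=25.0):
    """Split a desired gaze azimuth between the 2DOF neck and the
    animated eyes: the neck takes as much of the rotation as its
    range allows, and the eyes cover the remainder (clamped)."""
    neck = max(-neck_limit_deg, min(neck_limit_deg, azimuth_deg))
    eyes = max(-eye_limit_deg, min(eye_limit_deg, azimuth_deg - neck))
    return neck, eyes

for az in (10.0, 35.0, 60.0):
    neck, eyes = split_gaze(az)
    print(f"target {az:+5.1f} deg -> neck {neck:+5.1f} deg, eyes {eyes:+5.1f} deg")
```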
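IrisTK authors dialogue flows as statecharts, which the toolkit compiles into executable dialogue logic. The sketch below mirrors only the control-flow idea behind a quiz-moderator turn-allocation strategy and uses none of IrisTK's actual API; `get_answer` is a hypothetical stand-in for the speech recognizer.

```python
def quiz_round(question, players, get_answer):
    """Address each player in turn until someone answers.

    get_answer(player) stands in for the speech recognizer and
    returns the player's answer, or None on a response timeout.
    """
    print(f"Moderator: {question}")
    for player in players:  # allocate the turn by gaze and address
        print(f"(gazes at {player}) {player}, what do you think?")
        answer = get_answer(player)
        if answer is not None:
            print(f"{player} answered: {answer}")
            return player, answer
        print(f"(no reply from {player}; passing the turn)")
    return None, None

# Example run with a stubbed recognizer: Anna times out, Bert answers.
answers = {"Anna": None, "Bert": "Stockholm"}
quiz_round("What is the capital of Sweden?", ["Anna", "Bert"], answers.get)
```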