From acoustic cues to an expressive agent

  • Authors:
  • Maurizio Mancini; Roberto Bresin; Catherine Pelachaud

  • Affiliations:
  • LINC, IUT de Montreuil, University of Paris 8; Royal Institute of Technology, Stockholm; LINC, IUT de Montreuil, University of Paris 8

  • Venue:
  • GW'05 Proceedings of the 6th international conference on Gesture in Human-Computer Interaction and Simulation
  • Year:
  • 2005

Abstract

This work proposes a new way of providing feedback on expressivity in music performance. Starting from studies of expressive music performance, we developed a system that gives the user visual feedback through a graphical representation of a human face. The first part of the system, previously developed by researchers at KTH Stockholm and at the University of Uppsala, performs real-time extraction and analysis of acoustic cues from the music performance. The extracted cues are: sound level, tempo, articulation, attack time, and spectrum energy. From these cues the system derives a high-level interpretation of the performer's emotional intention, which is classified into one basic emotion, such as happiness, sadness, or anger. We have implemented an interface between that system and the embodied conversational agent Greta, developed at the University of Rome "La Sapienza" and the University of Paris 8. We model the expressivity of the agent's facial animation with a set of six dimensions that characterize the manner of behavior execution. In this paper we first describe a mapping between the acoustic cues and the expressivity dimensions of the face. We then show how to determine the facial expression corresponding to the emotional intention resulting from the acoustic analysis, using the sound level and tempo characteristics of the music to control the intensity and the temporal variation of muscular activation.
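To make the cue-to-expressivity pipeline described above concrete, the sketch below shows one possible shape of such a mapping in Python. It is not the authors' implementation: the dimension names follow the Greta expressivity literature, while the classification thresholds, the normalization of the cues, and the weighting rules are illustrative assumptions only.

```python
# Illustrative sketch of the pipeline: acoustic cues -> basic emotion and
# expressivity parameters. Thresholds and weights are hypothetical.
from dataclasses import dataclass


@dataclass
class AcousticCues:
    """Cues extracted in real time from the music performance (normalized 0..1)."""
    sound_level: float
    tempo: float          # relative to a reference tempo
    articulation: float   # 0 = legato, 1 = staccato
    attack_time: float    # 0 = short, 1 = long
    spectrum_energy: float


def classify_emotion(cues: AcousticCues) -> str:
    """Toy rule-based classification into one basic emotion (hypothetical rules)."""
    if cues.tempo > 0.6 and cues.sound_level > 0.6:
        return "anger" if cues.articulation > 0.5 else "happiness"
    if cues.tempo < 0.4 and cues.sound_level < 0.4:
        return "sadness"
    return "neutral"


def cues_to_expressivity(cues: AcousticCues) -> dict:
    """Hypothetical mapping of the acoustic cues onto six expressivity dimensions.
    Dimension names are taken from the Greta agent literature; weights are made up."""
    return {
        "overall_activation": 0.5 * cues.sound_level + 0.5 * cues.tempo,
        "spatial_extent": cues.sound_level,
        "temporal_extent": 1.0 - cues.tempo,      # slower tempo -> longer movements
        "fluidity": 1.0 - cues.articulation,      # legato playing -> smoother motion
        "power": 0.5 * cues.sound_level + 0.5 * (1.0 - cues.attack_time),
        "repetitivity": cues.spectrum_energy,
    }


if __name__ == "__main__":
    cues = AcousticCues(sound_level=0.8, tempo=0.7, articulation=0.3,
                        attack_time=0.2, spectrum_energy=0.6)
    print(classify_emotion(cues))        # e.g. "happiness"
    print(cues_to_expressivity(cues))    # expressivity parameters for the face
```

In this reading, sound level and tempo jointly drive the overall intensity of the facial activation, while articulation and attack time shape how smoothly and forcefully the muscular activation unfolds over time, mirroring the control of intensity and temporal variation mentioned in the abstract.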